[Front-page figure: Stage 1: Extraction of 2-D motion information; Stage 2: Calculation of 3-D motion information]



(54) Methods for creating an image for a three-dimensional display, for calculating depth
information, and for image processing using the depth information

(57) A method is proposed for automatically obtaining depth information from a 2-D motion image, so as to
create an image for a 3-D display. Further, methods are proposed for selecting appropriate frames for the
calculation of the depth information, or discontinuing the calculation, and for conducting image processing
using the depth information. Examples of such image processing include the creation of a viewfinder image
seen from a different point, natural scaling operations on an image area, and separation of a desired image
area. First, motion information of an object on a screen is extracted by block matching or the like. Second,
the actual movement of the object in the 3-D world is calculated. Since the viewfinder image is a projection
of a 3-D space, it is possible to obtain the original 3-D movement of the object, based on the movements of
a plurality of representative points through an inverse transformation, the representative points being
provided in the viewfinder image. As a result, 3-D coordinates of the object are identified, so that depth
information of the object is obtained. Afterwards, a parallax is calculated based on the depth information,
so as to create right and left eye images from the input viewfinder image. Alternatively, image processing,
such as separation of an object having a depth within a predetermined range, is carried out based on the
depth information.

Description

BACKGROUND OF THE INVENTION

Field of the Invention:

The present invention relates to methods for creating an image for a three-dimensional display (hereinafter referred
to as a 3-D display), for calculating depth information, and for image processing using the depth information. The
method for creating an image for a 3-D display particularly relates to a method for creating a pseudo-viewfinder image
shot by a multi-eye camera (a stereo image) from a viewfinder image shot by a monocular camera. The method for
calculating depth information relates to a method for obtaining a distance between an object and a viewer, which is
applicable to practicing the method for creating an image for a 3-D display. The method for image processing using the
depth information relates to applications including the creation of an image for a 3-D display and the suspending of
creation of an image for a 3-D display, an enhanced display, or the like.

Description of the Prior Art:

[1] Creation of an Image for a 3-D Display 

In fields related to television, techniques for creating a 3-D image (a pseudo stereoscopic vision) through the
detection of a movement of a 2-D motion image have been known. One typical example of such a technique is a 3-D display
employing a time difference method, the principle of which will be described with reference to Figs. 1 to 3.

In a scene where an object moves from left to right while the background stays still, as shown in Fig. 1, by
reproducing respective images for right and left eyes (hereinafter respectively referred to as right and left eye
images) so as to have a predetermined lapse of time between them, as shown in Fig. 2, a parallax θ is caused, as shown
in Fig. 3. "A parallax" or "a binocular disparity" is defined as an angular difference between sight vectors directed at
one point from right and left eyes, respectively. In Fig. 1, since a viewer perceives the car as being closer than the
background due to parallax, a pseudo stereoscopic vision is achieved. When the object, the car in this case, moves in
the opposite direction, respective images should be reproduced such that the one for a right eye is reproduced earlier
than the one for a left eye by a predetermined time, contrary to the example shown in Fig. 3.

JP Publication No. Sho 55-36240 discloses a display apparatus for a stereoscopic image using depth information,
in which the apparatus receives only an image signal shot from one basic direction (that is, a 2-D motion image) among
signals from multiple directions and a signal containing the depth information of an object, so as to reproduce within
the apparatus the original viewfinder image shot from multiple directions. The purpose of the apparatus is to reduce a
transmission bandwidth. The apparatus incorporates a variable delay circuit for causing a time delay while controlling
the extent thereof according to depth information. The time delay results in a parallax. According to an output signal
from the circuit, image signals are reproduced for right and left eyes. In this way, a pseudo stereoscopic image is
displayed. This publication discloses, as preferred embodiments of the disclosed apparatus, (1) a device for displaying
a pseudo stereoscopic image for a viewer by respectively supplying right and left eye images to two CRTs, which are
situated forming a predetermined angle with a half mirror, and (2) a device for displaying a pseudo stereoscopic image
for a viewer even if the viewer moves in a horizontal direction, using a lenticular lens fixed to a display screen.

However, the above apparatus works on the condition that depth information is supplied externally. Therefore, if it
only receives a 2-D motion image, the apparatus cannot create a pseudo stereoscopic image.

JP Application Laid-Open No. Hei 7-59119 also discloses an apparatus for creating a pseudo stereoscopic image
based on a 2-D motion image. The apparatus comprises a detection circuit for detecting a motion vector from a supplied
2-D motion image, and a delay circuit for delaying either a right or a left image according to the motion vector. The
delay causes a parallax. This application discloses, as a preferred embodiment of the disclosed apparatus, a head
mounted display (HMD), which is a glasses type display for supplying different images to right and left eyes. Through
the HMD, a viewer can see a pseudo stereoscopic image.

In this apparatus, however, since the extent of delay is determined according to the magnitude of a motion vector,
any object moving at high speed appears to be closer to the viewer, resulting in an unnatural stereoscopic view, which
is inconsistent with the actual distance between the viewer (or the camera) and the object (that is, a depth).

JP Laid-open Application No. Sho 60-263594 also discloses an apparatus for displaying a pseudo stereoscopic
image using a time difference method, in which the apparatus displays right and left images alternately for every
field, so that they are seen alternately via shutter glasses for every field, as the shutter glasses alternately open
their right and left eyes. This application further discloses a method for generating a stereoscopic effect by providing
a longer time difference between right and left images when an object moves at low speed. However, since this apparatus
also does not operate based on depth information, it is not really possible for an accurate pseudo stereoscopic image to
be created and thus displayed.

"PIXEL" (No. 128), a magazine issued on May 1, 1993, describes on pages 97 to 102 a pseudo stereoscopic image
system using depth information. In the system, an object is first displayed as a gray-scale image where the gray-scale
level corresponds to the depth, and then, based on the gray level, the appropriate parallax is calculated in terms of
the number of pixels, so that right and left images are created to be seen via shutter glasses. However, the perspective
image is manually created and a technique for automating the creation is not disclosed.

National Publication No. Hei 4-504333 (WO88/04804) discloses a method for achieving a pseudo stereoscopic
image using depth information. The method comprises steps of dividing a 2-D motion image into some areas, giving
the divided areas depth information so as to provide each of the areas with a parallax, and creating a pseudo
stereoscopic image. However, the depth information is manually supplied and a technique for automating the supply is
not disclosed.

In a research field called "Computer Vision," a study has been conducted into a method for estimating a 3-D
structure and movement of an object. Concretely speaking, the study, which is aimed at self-control of a robot, relates
to acquisition of an accurate distance from a viewpoint to an object by either shooting the object using a stereo camera
(a multi-eye camera), or using a monocular camera while moving it. Several aspects of this technology are described in a
report entitled "1990 Picture Coding Symposium of Japan (PCSJ90)," for example, on page 57.

[2] Creation of Depth Information

Computer Vision would enable detection of the depth of an object. However, in the calculation of depth information,
which is based on 2-D motion information, suitable images are not always supplied for the calculation. If the
calculation is continued even with unsuitable images supplied, serious errors are likely to be caused. That is, if depth
information is obtained from such unsuitable images, and then used for the creation of a stereoscopic image, it is
quite likely that the resulting stereoscopic image will be unnatural, exhibiting such anomalies as a person in the
distance appearing closer than a person who actually is closer.

It is to be noted that the idea of obtaining depth information through understanding of a corresponding relationship
between frames has been known. For example, JP Application Laid-Open No. Hei 7-71940 (which corresponds to
USP 5475422) mentions, as prior art, (1) a technique for relating a point or a line between two images shot by a stereo
camera, so as to estimate the position of the point or line in actual space (the 3-D world), and (2) a technique for
shooting an object with a camera while moving it, so as to obtain sequential viewfinder images for tracing the
movements of a characteristic point on the sequential viewfinder images, and thereby estimating the position of each
characteristic point in actual space.

[3] An Image Processing Method Using Depth Information

A method for controlling the movement of a robot using depth information is known, such as the foregoing Computer
Vision. A method for creating a pseudo stereoscopic image using depth information is also known, such as is disclosed
in the foregoing JP Publication No. Sho 55-36240.

On the other hand, a method for using depth information in image processing other than the creation of a pseudo
stereoscopic image has scarcely been proposed.

SUMMARY OF THE INVENTION

The first object of the present invention relates to the creation of an image for a 3-D display, as described in
the above [1]. In defining the object of the present invention, the inventor draws attention to the fact that all of the
foregoing techniques for creating a pseudo stereoscopic image have at least one of the following problems to be solved:

1. An accurate stereoscopic image based on depth information is not created. Instead, a mere 3-D effect is
provisionally created according to the extent of movement. Further, since a parallax needs to be created using a delay
in time (a time difference), horizontal movement of an object is required as a premise of the creation, which
constitutes a fundamental restriction.

2. As it is not automated, the process for obtaining depth information from a 2-D motion image requires an editing
process. Thus, a stereoscopic image cannot be output in real time upon input of the 2-D motion image.

Therefore, the first object of the present invention is to create an accurate stereoscopic image, based on depth
information, by applying the foregoing technique relating to computer vision to an image processing field including
technical fields related to television.

In order to achieve this object, according to the present invention, depth information is extracted from a 2-D
motion image, based on which an image for a 3-D display is created. This applies a technique related to computer vision
to a technical field relating to image display. According to one aspect of the present invention, depth information is
obtained through the following processes: that is, the movement of a 2-D motion image is detected; a relative 3-D
movement between the scene and the shooting viewpoint of the 2-D motion image is calculated; and relative distances from
the shooting viewpoint to the respective image parts of the 2-D motion image are calculated, based on the relative 3-D
movement and the movements of the respective image parts. Based on the thus obtained depth information, a pseudo
stereoscopic image is created.

This aspect of the present invention can be differently described as a depth being obtained through the following
processes: that is, a plurality of viewfinder frames (hereinafter referred to as frames) are selected from a 2-D motion
image to be processed; and a relative positional relationship in the actual 3-D world of the respective image parts is
identified based on a 2-D positional displacement between the frames. In other words, in order to determine the depth,
3-D movements of the respective image parts are calculated based on the 2-D positional displacement, based on which
positional coordinates of the respective image parts in the 3-D world are calculated, according to the principle of
triangulation. A frame is a unit for image processing, that is, a concept including a frame picture or a field picture
in MPEG, and the like.
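
As a hedged illustration of the triangulation principle mentioned above, the following Python sketch covers only the
simplest special case, in which the viewpoint translates sideways, parallel to the image plane, with no rotation. The
function name, the pinhole-camera model, and the parameters focal_length_px and baseline are assumptions introduced for
this example; the general case described in this specification first recovers the full relative 3-D movement.

```python
def depth_from_displacement(displacement_px, focal_length_px, baseline):
    """Depth of a point under pure sideways camera translation (assumed model).

    displacement_px: 2-D positional displacement of the point between frames, in pixels.
    focal_length_px: focal length expressed in pixels (hypothetical parameter).
    baseline: distance the viewpoint moved between the two frames.
    """
    if displacement_px == 0:
        return float("inf")  # no displacement: the point is effectively at infinity
    # Similar triangles: displacement / focal_length = baseline / depth
    return focal_length_px * baseline / abs(displacement_px)


near = depth_from_displacement(20.0, 500.0, 0.1)  # large displacement
far = depth_from_displacement(5.0, 500.0, 0.1)    # small displacement
```

Under this assumed model, an image part with a larger displacement is closer to the viewpoint than one with a smaller
displacement.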

Regarding a 2-D motion image, a plurality of viewfinder frames are hereinafter referred to as "different-time
frames," as they are shot at different times. (In the following description of a multi-eye camera, a plurality of frames
which are simultaneously shot are referred to as "same-time frames.") A positional displacement on a frame plane is
referred to as "a 2-D positional displacement." In this aspect of the present invention, where different-time frames are
discussed, "a 2-D positional displacement" means a change caused along with a lapse of time, that is, a movement.
(On the contrary, "a 2-D positional displacement" of same-time frames means a positional difference among a plurality
of frames.)

The second object of the present invention relates to the calculation of depth information, as described in the
above [2]. That is, the second object of the present invention is to propose a method for obtaining a correct
corresponding relationship among a plurality of images, so as to calculate accurate depth information, for selecting an
image to be input appropriate for the calculation, and for discontinuing the calculation of depth information when any
inconvenience occurs, such as could cause an unnatural pseudo stereoscopic image to be created. Further, the present
invention aims to propose methods for effectively determining corresponding and characteristic points, and for searching
and tracing points with a high accuracy.

In order to achieve this object, according to the present invention, two frames with appropriately large movements
between them are selected from a 2-D motion image, so that depth information is obtained from the two frames. According
to this aspect of the invention, it is possible to obtain a good calculation result, with pre-selection of frames which
may facilitate the calculation. A judgement as to whether frames have appropriately large movement or not may be
based on the extent or variance of movement of a characteristic point.
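
The frame-selection judgement can be sketched as follows. This is a minimal Python illustration, not the procedure of
the embodiments: the function names, the use of the mean displacement magnitude as the "extent", and both threshold
values are assumptions made for the example.

```python
from math import hypot
from statistics import pvariance


def movement_stats(points_t, points_t2):
    """Return (mean, variance) of the displacement magnitudes of characteristic points."""
    magnitudes = [hypot(x2 - x1, y2 - y1)
                  for (x1, y1), (x2, y2) in zip(points_t, points_t2)]
    return sum(magnitudes) / len(magnitudes), pvariance(magnitudes)


def frames_have_enough_movement(points_t, points_t2, min_extent=2.0, min_variance=1.0):
    """Accept the frame pair if either assumed criterion is met (thresholds are hypothetical)."""
    extent, variance = movement_stats(points_t, points_t2)
    return extent >= min_extent or variance >= min_variance


# Two characteristic points: nothing moves in the first pair, one point moves in the second.
still = frames_have_enough_movement([(0, 0), (10, 0)], [(0, 0), (10, 0)])
moving = frames_have_enough_movement([(0, 0), (10, 0)], [(3, 4), (10, 0)])
```

A frame pair for which the function returns False would be skipped in favour of a pair further apart in time.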

According to another aspect of the invention, with a representative point provided in a reference frame, the
similarity of images is evaluated between an image area including a characteristic point in the other frame (hereinafter
referred to as an object frame), and an image area including the representative point in the reference frame. A
characteristic point is a candidate for a corresponding point subject to an evaluation, the candidate being arbitrarily
determined. Then, the relative positional acceptability between the characteristic point and the other characteristic
point is evaluated. That is, a judgement is made as to whether the relative positional relationship between the
characteristic point and the other characteristic point is reasonable or acceptable with respect to being determined as
the same as the relative positional relationship between the representative point and the other representative point
respectively corresponding to the characteristic points. When both evaluations result in a favorable score, the
characteristic point is tentatively determined as a corresponding point of the representative point. Subsequently, a
best point is searched for where each of the evaluations yields the best result by moving one corresponding point within
a predetermined search area, while assuming that all the other corresponding points are fixed (this method hereinafter
being referred to as a fixed searching method). The best position, which has been found during the search, is determined
as a new position of the corresponding point. All corresponding points are sequentially subjected to this search and the
positional change. Afterwards, depth information is obtained based on a positional relationship between a
representative point and its corresponding point, the corresponding point having been obtained through the
above-mentioned series of processes.

Conventionally, the similarity of images has been evaluated by block matching or the like. In this invention, on
the other hand, by including an additional evaluation of the relative positional acceptability, the corresponding
relationship between frames can be more accurately detected. The accuracy can be further improved through iterative
calculations.
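
The fixed searching method and its iteration can be sketched in Python as a coordinate-wise search. This is an
illustrative sketch under stated assumptions: the combined evaluation is passed in as a caller-supplied function cost
(lower is better), and the names fixed_search, radius, and sweeps are hypothetical. How the similarity and the relative
positional acceptability are actually combined is not shown here.

```python
def fixed_search(points, cost, radius=1, sweeps=3):
    """Refine corresponding points one at a time while all the others stay fixed.

    points: list of (x, y) tentative corresponding points.
    cost(i, candidate, points): assumed combined evaluation of placing point i at
        candidate, given the current positions of the other points (lower is better).
    radius: half-width of the square search area around the current position.
    sweeps: number of passes over all points (the iterative calculation).
    """
    points = list(points)
    for _ in range(sweeps):
        for i, (px, py) in enumerate(points):
            best_cost, best_pos = cost(i, (px, py), points), (px, py)
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    candidate = (px + dx, py + dy)
                    c = cost(i, candidate, points)
                    if c < best_cost:
                        best_cost, best_pos = c, candidate
            points[i] = best_pos  # adopt the best position found in the search area
    return points


# Toy evaluation: squared distance to a known target position for each point.
targets = [(2, 2), (5, 5)]
toy_cost = lambda i, c, pts: (c[0] - targets[i][0]) ** 2 + (c[1] - targets[i][1]) ** 2
refined = fixed_search([(0, 0), (5, 4)], toy_cost)
```

In the toy run each point moves by at most one pixel per sweep, so the first point needs two sweeps to reach its target,
which illustrates why repeated passes can improve the result.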

According to one aspect of the present invention, the similarity of the images is evaluated by block matching which
is modified such that the similarity is correctly evaluated to be highest when the blocks including the identical object
are tested, regardless of shooting conditions (hereinafter referred to as biased block matching). As to same-time
frames, a certain color deflection tends to occur due to characteristic differences of a plurality of cameras. As to
different-time frames, the same problem will arise due to changing weather from time to time, as this causes a change in
brightness of a viewfinder image. After correction is made to solve such problems, the similarity of images is
transformed to be expressed in the form of a geometrical distance, which is a concept for judging the acceptability of
relative positions. Then, the relative positional acceptability and the transformed similarity are combined together to
be used for a general judgement on the evaluation results. In this case, biased block matching may be conducted within a
correction limitation, which is pre-determined depending on a distance between the reference and object frames. That is,
when the distance is larger, a larger correction limitation is set accordingly.
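
One possible reading of biased block matching is sketched below in Python. The passage above does not give the
correction formula, so the sketch assumes the simplest form: a uniform brightness offset between the two blocks is
estimated, clamped to the correction limitation, and removed before a sum of squared differences is taken. The function
names and the per_frame constant are hypothetical.

```python
def correction_limit(frame_distance, per_frame=2.0):
    """Assumed rule: the allowed brightness correction grows with the frame distance."""
    return per_frame * frame_distance


def biased_block_difference(block_a, block_b, limit):
    """Sum of squared differences after removing a clamped brightness offset.

    block_a, block_b: flat lists of pixel values of equal length.
    limit: largest brightness offset that may be compensated.
    """
    n = len(block_a)
    offset = sum(b - a for a, b in zip(block_a, block_b)) / n  # mean brightness change
    offset = max(-limit, min(limit, offset))                   # stay within the limitation
    return sum((b - a - offset) ** 2 for a, b in zip(block_a, block_b))


block_t = [10, 20, 30, 40]
block_t2 = [15, 25, 35, 45]  # same content, uniformly 5 levels brighter
fully_corrected = biased_block_difference(block_t, block_t2, correction_limit(5))
partly_corrected = biased_block_difference(block_t, block_t2, correction_limit(1))
```

With a limit of 10 the uniform brightening is fully compensated and the blocks match exactly; with a limit of 2 a
residual difference remains.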
A correction for off-setting a change in brightness is disclosed in JP Laid-Open No. Hei 5-3086630. However, the
correction is applicable only to cases of fading-out or fading-in (a consistent change in brightness), but not to a
partial change in brightness.

According to another aspect of the invention, depth information is obtained through the following processes: that
is, a plurality of representative points are provided in a reference frame; a plurality of corresponding points of the
representative points are determined in an object frame, so that each corresponds to a respective one of the
representative points; and a positional relationship between at least a characteristic point among the representative
points, and its corresponding point is obtained. As a characteristic point, a point whose position moves steadily among
a plurality of different-time frames is selected, because such a point is considered to be accurately traced.

Likewise, according to another aspect of the present invention, if a point, whose displacement between same-time
frames is substantially consistent or changes substantially consistently, also shows similarly consistent movement or
change in movement between other same-time frames shot at a close but different time from the above, such a point
may be selected as a characteristic point.

According to a further aspect of the present invention, depth information is obtained through the following
processes: that is, a plurality of representative points are provided in a reference image; a plurality of
corresponding points of the representative points are determined in the other image; a positional relationship between
each representative point and its corresponding point is obtained; and depth information is calculated according to the
positional relationship, in which the calculation of the depth information is discontinued when an insufficient number
of characteristic points are established among the representative points or the movements of the characteristic points
are too small, because it is then very unlikely that a positional relationship between viewfinder images will be
obtained with a high accuracy.
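
The discontinuation rule can be expressed as a small guard function. This Python sketch is illustrative only; the
function name, the use of the mean movement, and both thresholds are assumptions, since no numeric criteria are stated
here.

```python
def should_discontinue(movements, min_points=8, min_movement=1.0):
    """Decide whether to skip the depth calculation for the current images.

    movements: displacement magnitudes (in pixels) of the established characteristic points.
    min_points, min_movement: hypothetical thresholds.
    """
    if len(movements) < min_points:
        return True  # too few characteristic points were established
    mean_movement = sum(movements) / len(movements)
    return mean_movement < min_movement  # the points barely move


too_few = should_discontinue([5.0] * 3)
too_small = should_discontinue([0.2] * 10)
usable = should_discontinue([3.0] * 10)
```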

Two conceptually different corresponding points exist, that is, a true corresponding point and a computed
corresponding point. In principle, each representative point has a sole corresponding point, eliminating the possibility
of the existence of any other corresponding points in any other positions. This idealistic sole corresponding point is
the true corresponding point. On the other hand, a corresponding point determined through calculations for image
processing does not necessarily coincide with the true corresponding point. This is the computed corresponding point,
which may possibly exist in any position other than that of the true corresponding point, and change its position
arbitrarily. The positional change may be resorted to in a process for improving the accuracy of the corresponding
point, as described later. In this specification, the term "corresponding point" is used to include both true and
computed corresponding points without a distinction between the two concepts, unless it is necessary to differentiate
them.

According to a further aspect of the present invention, a depth of a 2-D image is obtained, in which when the depth
of any point in a certain image is calculated as negative, the depth is recalculated while referring to the depth
information of points close by with a positive depth value. That is, when a depth is calculated as negative, that is
probably because of unsuitable variables being used during the calculation. Therefore, such a negative depth should be
corrected based on the depth of a close point.
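
A minimal Python sketch of this correction follows. It assumes that "a close point" means the nearest point (in image
coordinates) whose depth is positive and that its depth is simply copied; interpolating among several neighbours would
be an equally plausible reading. The function name and data layout are hypothetical.

```python
from math import hypot


def correct_negative_depths(points):
    """Replace non-positive depths with the depth of the nearest positive-depth point.

    points: list of (x, y, depth) tuples.
    """
    valid = [p for p in points if p[2] > 0]
    if not valid:
        return list(points)  # nothing to refer to; leave the values unchanged
    corrected = []
    for x, y, depth in points:
        if depth <= 0:
            # Nearest neighbour in the image plane among points with positive depth
            nearest = min(valid, key=lambda p: hypot(p[0] - x, p[1] - y))
            depth = nearest[2]
        corrected.append((x, y, depth))
    return corrected


fixed = correct_negative_depths([(0, 0, 5.0), (1, 0, -2.0), (10, 0, 8.0)])
```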

The third object of the present invention relates to the above [3], that is, utilization of depth information in
image processing other than the creation of a pseudo stereoscopic image.

In order to achieve this object, according to the present invention, in creating a stereo image by giving a
parallax to a 2-D image according to its depth information, the parallax is first changed so as to fall within a
predetermined range, so that the stereo image will be created according to the changed depth information. An
excessively large parallax would cause fatigue on a viewer's eyes. On the contrary, an excessively small parallax would
invalidate the meaning of a parallax as data. Therefore, it is necessary to keep a parallax within a desired range.
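
Two simple ways of bringing parallax values into a predetermined range are sketched below in Python. Both are
assumptions for illustration (clipping, and a linear rescaling that preserves the ordering of the values); the
transformation actually used in the embodiments is not given in this passage.

```python
def clip_parallax(parallaxes, lowest, highest):
    """Clamp each parallax value to the range [lowest, highest]."""
    return [max(lowest, min(highest, p)) for p in parallaxes]


def rescale_parallax(parallaxes, lowest, highest):
    """Linearly map the observed parallax range onto [lowest, highest]."""
    p_min, p_max = min(parallaxes), max(parallaxes)
    if p_max == p_min:
        # All values equal: place them in the middle of the allowed range
        return [(lowest + highest) / 2.0 for _ in parallaxes]
    scale = (highest - lowest) / (p_max - p_min)
    return [lowest + (p - p_min) * scale for p in parallaxes]


clipped = clip_parallax([-3, 4, 12], 0, 10)
rescaled = rescale_parallax([0, 10, 20], 2, 6)
```

Clipping leaves in-range values untouched but flattens extremes, whereas rescaling keeps relative differences at the
cost of changing every value.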

According to another aspect of the invention, in creating a stereo image by giving a parallax to a 2-D image
according to its depth information, the parallax originally determined according to the depth information is set to be
variable. With this arrangement, upon an instruction by a viewer to change a parallax, for example, it is possible to
create and display a pseudo stereoscopic image which is agreeable to the preference of the viewer.

According to a further aspect of the invention, in creating a stereo image by giving a parallax to a 2-D image
according to its depth information and displaying the stereo image on a stereo image display apparatus, a process to be
conducted on the 2-D image so as to cause the parallax is determined according to a display condition unique to the
stereo image display apparatus. The display condition is governed by the size of a display screen of the display
apparatus, and an assumed distance from the display screen to a viewer.
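
Why the display condition matters can be illustrated with a small Python sketch that converts a desired angular shift,
as seen by the viewer, into a horizontal pixel shift. The geometric model, the function name, and all parameters are
assumptions made for this example and are not taken from Embodiment 8.

```python
import math


def parallax_to_pixels(angle_rad, viewing_distance_mm, screen_width_mm, screen_width_px):
    """Pixel shift that subtends angle_rad at the viewer (assumed simple geometry)."""
    # On-screen separation that subtends the angle at the assumed viewing distance
    shift_mm = 2.0 * viewing_distance_mm * math.tan(angle_rad / 2.0)
    # Convert millimetres on the screen to pixels
    return shift_mm * screen_width_px / screen_width_mm


angle = 2.0 * math.atan(0.005)  # subtends 10 mm at a distance of 1000 mm
small_screen = parallax_to_pixels(angle, 1000.0, 500.0, 1000)
large_screen = parallax_to_pixels(angle, 1000.0, 1000.0, 1000)
```

Under these assumptions, the same angular shift requires fewer pixels on the physically larger screen, so a shift tuned
for one display would be wrong on another.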

According to a further aspect of the invention, in creating a stereo image by giving a parallax for every image
part of a 2-D image according to its depth information, an uneven image frame outline caused by the given parallax is
corrected. More particularly, in giving a parallax, if an image area shown at the right end of the screen, for example,
is displaced slightly rightward, the image area consequently projects beyond the original shape of the image frame, and
thus causes uneven parts along the edge of the image frame. A correction made to such an uneven part would straighten
the appearance of the frame. The correction may be made by uniformly cutting off a peripheral part of the frame at a
certain width, so as to achieve a desired shape of the image frame.

EP 0 735 512 A2

According to a further aspect of the invention, in a method where image processing is carried out for a 2-D image
according to its depth information, an image area subject to the image processing is determined, based on the depth
information. With this arrangement it is possible to separate an object or to change the scale of an object a certain
distance from a viewer.
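
Determining the image area from depth can be sketched as a per-pixel test. The Python example below is a minimal
illustration that assumes a dense depth map aligned with the image; the function name and the background fill value are
hypothetical.

```python
def separate_by_depth(image, depth_map, nearest, farthest, background=0):
    """Keep pixels whose depth lies in [nearest, farthest]; blank out the rest.

    image, depth_map: lists of rows with identical dimensions.
    """
    return [[pixel if nearest <= depth <= farthest else background
             for pixel, depth in zip(image_row, depth_row)]
            for image_row, depth_row in zip(image, depth_map)]


image = [[1, 2], [3, 4]]
depth_map = [[1.0, 5.0], [2.0, 9.0]]
separated = separate_by_depth(image, depth_map, 1.5, 6.0)
```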

According to a further aspect of the invention, in a method where image processing is carried out on a 2-D image
according to its depth information, images with viewpoints at a plurality of points on a hypothetical moving path, where
a shooting point of the 2-D image is hypothetically moved, are created for use as a slow motion image, based on the
depth information.

It is to be noted that, according to the present invention, a viewfinder image seen from a different point may be
created according to depth information. Positional displacements of respective image parts, which will be caused
accompanying a change in the viewpoint, are calculated based on depth information, so that a viewfinder image is
re-created so as to correspond to the positional displacements caused. When a viewpoint is changed in height, for
example, a displacement (the extent of translation and rotation movements) of the object (respective image parts) can be
calculated based on the distance by which the camera has moved and the depth information, so that a desired viewfinder
image will be created based on the calculation result.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a viewfinder image where an object moves from left to right while a background stays still.
Fig. 2 shows reproductions of right and left images having a time lag between them.
Fig. 3 shows a parallax caused due to the lapse of time of Fig. 2.
Fig. 4 shows main stages for the creation of an image for a 3-D display according to Embodiment 1.
Fig. 5 is a flowchart for the detection of a corresponding relationship between viewfinder frames.
Fig. 6 shows representative points provided in a reference frame t.
Fig. 7 shows block matching.
Fig. 8 is a conceptual model where the value of E1 is indicated for a tentative corresponding point Pt'(i,j) in a
perpendicular direction.
Fig. 9 shows a relationship between a representative point and its corresponding point determined at S12.
Fig. 10 is an explanatory diagram regarding a principle of evaluation of a relative position of corresponding points.
Fig. 11 shows a result of improvement processing conducted on candidates for corresponding points in Fig. 9.
Fig. 12 shows a relationship between movements of Point P on a screen and a 3-D space.
Fig. 13 is an explanatory diagram regarding a principle of determining 3-D coordinates of Point P, based on the
3-D movement of a camera and the movement of Point P on a screen.
Fig. 14 shows representative points each given an actual numeric value.
Fig. 15 shows a parallax given according to depth information.
Fig. 16 shows right and left images created from Frame t.
Fig. 17 shows a non-linear transformation with respect to a parallax.
Fig. 18 shows an example of a hardware structure for practicing Embodiment 1.

Fig. 19 is a monochrome picture showing a viewfinder image in Frame t.
Fig. 20 is a monochrome picture showing a viewfinder image in Frame t'.
Fig. 21 is a monochrome picture of Frame t overlaid with a grid for division, and provided with representative points.
Fig. 22 is a monochrome picture showing initial positions of corresponding points in Frame t'.
Fig. 23 is a monochrome picture showing corresponding points at improved positions in Frame t'.
Fig. 24 is a monochrome picture embodying depth information with a gray-scale image.
Fig. 25 is a monochrome picture of a right image created according to depth information.
Fig. 26 is a monochrome picture of a left image created according to depth information.
Fig. 27 shows main stages for the creation of an image for a 3-D display according to Embodiment 3.
Fig. 28 shows selection criteria with respect to a characteristic point which is introduced in Embodiment 3.
Fig. 29 shows a corresponding relationship of an original viewfinder image and one re-created so as to be seen
from a changed viewpoint.
Fig. 30 shows a corresponding relationship of an original viewfinder image and one re-created so as to be seen
from a changed viewpoint.
Fig. 31 shows an image with a part expanded.
Fig. 32 shows an image with a house separated from the image in Fig. 29.
Fig. 33 shows a structure of a stereoscopic image display apparatus in Embodiment 8.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Preferred embodiments of the present invention will next be described with reference to the accompanying
drawings. In Embodiments 1 to 4, an apparatus outputs as a final image an image for a 3-D display (a pseudo
stereoscopic image), while in Embodiments 5 to 7, it outputs an image for a 2-D display (an ordinary 2-D image).

In Embodiments 1 and 2, the apparatus initially receives a viewfinder image shot on a monocular camera, while in
Embodiment 3 it receives a viewfinder image shot on a multi-eye camera (a stereo image). Embodiments 1, 2, and 3
correspond to Embodiments 5, 6, and 7, respectively, except that the former output an image for a 3-D display and the
latter for a 2-D display. Embodiment 8 relates to a displaying method, in which unique conditions for a display
apparatus are considered when displaying a pseudo stereoscopic image.

Embodiment 1. 

Fig. 4 shows main stages for the creation of an image for a 3-D display according to Embodiment 1. Up to the third
stage, the content of the method for calculating depth information according to the present invention will become
apparent.

In Embodiment 1, an image for a 3-D display is created based on an image for a 2-D display through Stages 1 to 3
for analyzing an image for a 2-D display and Stage 4 for creating an image for a 3-D display. Respective stages will
next be outlined.

[Stage 1] Extraction of 2-D Motion Information 

Information about the movement of an object shown in a viewfinder image is first extracted. The motion information
is 2-D at this stage. That is, coordinates are overlaid onto a display screen, so that the movement of the object on the
screen is expressed by means of 2-D coordinates.

In order to understand the movement of the object, a corresponding relationship between viewfinder images is
detected. A viewfinder image at Time t is designated as a reference frame (hereinafter referred to as "Frame t"), while
a viewfinder image at Time t' is designated as an object frame (hereinafter referred to as "Frame t'"). In Frame t, a
plurality of representative points are pre-provided, so that corresponding points of the representative points are traced in
Frame t'. Frames t and t' are frames at different times, though they need not be adjacent
in terms of frame sequence. Stage 1 is characterized by the fact that 2-D information can be extracted not only from a
horizontal movement of an object but also from movements in any direction. Hereinafter in this specification, (t) and (t')
are defined as time, and a frame is defined as a unit constituting a viewfinder image in general, but is not limited to a
particular frame of a television picture receiver, which comprises 525 scanning lines, or a screen of a personal
computer, which comprises 640 x 480 pixels or the like. Alternatively, representative points may be provided not only in
Frame t but in both Frames t and t'.

[Stage 2] Calculation of 3-D Motion Information 

After identifying the 2-D movement of the object, information about an actual 3-D movement of the object is
calculated as 3-D motion information. The 3-D motion is expressed by six parameters: three for translation and three for
rotation. This calculation is made based on a plurality of pairs of representative and corresponding points.

[Stage 3] Acquisition of Depth Information

Identification of the actual movement of the object defines a relative positional relationship between the
objects at different times. Further, identification of this relationship provides depth information of the object or its
respective parts (hereinafter referred to as respective image parts).

[Stage 4] Creation of Image

A parallax is determined based on the depth information, so as to create right and left images. The parallax is
determined such that a closer object has a larger parallax. Since respective image parts should have different depths,
right and left images should be created such that the respective image parts of each image have a different parallax.
Two facts must be clearly distinguished and not confused: motion information can be extracted from a movement in any
direction at Stage 1, whereas the direction in which a parallax is provided at Stage 4 is limited to the horizontal
direction, due to the horizontal locations of both eyes viewing the object.

The respective stages in Embodiment 1 have been outlined above. In the following, they will be described in further
detail.

[Stage 1] Extraction of 2-D Motion Information 

Fig. 5 is a flowchart for detection of a corresponding relationship between viewfinder image frames, the respective
steps of which will next be described one by one.

(Step 10) Providing a Representative Point in Frame t

As shown in Fig. 6, representative points are provided in a reference frame, Frame t. In Fig. 6, Frame t is divided into
blocks of 8 x 8 pixels by overlaying it with a grid, and representative points are provided at every crossing point of the
horizontal and vertical lines of the grid. The representative point that is the i-th from the left and the j-th from the top is
expressed as Pt(i,j); the corresponding point of Pt(i,j) at Time t' is expressed as Pt'(i,j). The x and y coordinates of Pt(i,j)
are expressed, if required, as Pt(i,j)x and Pt(i,j)y, respectively.

A representative point may be provided not only at a crossing point but also at any desired point. As an extreme
case, all pixels may be individually designated as independent representative points.

(Step 11) Setting a Corresponding Point Candidate Area

Taking Pt(6,4) in Fig. 6 as an example, an area which may possibly include Pt'(6,4) is predetermined based on the
assumption that Pt'(6,4) is positioned in the vicinity of Pt(6,4), unless the viewfinder image moves so drastically as to
exceed a predetermined limit. In Embodiment 1, in order to reduce the calculation for positional detection,
Pt'(6,4) is assumed to exist in an area of 100 x 60 pixels in the vicinity of Pt(6,4).
Step 11 can also be modified as follows:

1. When a viewfinder image moves relatively drastically, two frames adjacent in terms of frame sequence are
determined as Frames t and t', so as to minimize the change in the position of a representative point between
Frames t and t', and thus the risk of the corresponding point falling outside the assumed area. Of course, it is
possible to assume the whole image area as a corresponding point candidate area. The risk of a corresponding
point falling outside the assumed area due to a large movement of the viewfinder image is thus reduced, although
the volume of calculation is increased as a result.

2. In the above, a corresponding point candidate area has been determined based on the simple assumption that
Pt'(6,4) is located in the vicinity of Pt(6,4). However, when the movement of Pt(6,4) is traced among a plurality of
frames, a corresponding point candidate area can be determined on the extension of the movement trail. This
method is particularly advantageous in limiting such an area in the case of a viewfinder image with a relatively
constant movement.

(Step 12) Calculation of Non-Similarity in a Corresponding Point Candidate Area

The position of a corresponding point is specified in the corresponding point candidate area. In this case, a problem
arises when the viewfinder image moves considerably slowly, contrary to the case considered in Step 11. That is, when
the viewfinder image moves only by a small extent, it is difficult to extract motion information and thus the risk of a
more severe error being included in the information is increased.

In order to prevent such a problem, Time t' is pre-selected such that Frames t and t' are set apart from each other
by some extent. In other words, after conducting statistical analysis of the extent of changes of respective image parts,
Time t' is selected such that the magnitude of the changes, or the variance of the extent of the changes, exceeds a
predetermined value. Alternatively, Time t' may be selected such that the total sum of, or the variance of, the movements
of more than a predetermined number of characteristic points (described later) exceeds a predetermined value. If no
Time t' that meets the above conditions is found, the creation of an image for a 3-D display (or the calculation of depth
information) is discontinued and, instead, the input viewfinder image is output intact or all image parts of the viewfinder
image are displayed as if having a uniform depth.

In this step, in order to determine the position of a corresponding point, non-similarity between Frames t and t' is
computed by a block matching method. That is, the total sum of squared differences of gray-scale levels (non-similarity)
is computed between one block having a certain point as its center in the corresponding point candidate area, and
another block including the representative point, so as to detect the point providing the minimum sum, which is then
determined as the computed corresponding point.

Fig. 7 shows block matching. In Embodiment 1, nine pixels constitute one block with the central pixel as a
representative point of the block.

Block 1 is provided on Frame t, including Pt(i,j), while Block 2 is provided on Frame t', including Pt'(i,j), which is a
tentative candidate for a corresponding point. With the pixel value of a pixel (x,y) at Time t designated as It(x,y), the
non-similarity (hereinafter referred to as E1) is generally obtained from the following Equation 1.

$$E1 = \sum_{u}\sum_{v}\{I_t(P_t(i,j)_x+u,\ P_t(i,j)_y+v) - I_{t'}(P_{t'}(i,j)_x+u,\ P_{t'}(i,j)_y+v)\}^2$$ [Equation 1]

wherein the two summations relate to u and v. Since u and v respectively take the values of

u = -1, 0, 1
v = -1, 0, 1

for a tentative Pt'(i,j), a squared difference of gray-scale level is obtained with respect to nine pixels in total.
Then, while gradually changing the position of Pt'(i,j) within the candidate area, the point with the minimum E1 value is
determined as the corresponding point.
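The search described in Step 12 can be sketched as follows. This is a minimal illustration under stated assumptions, not the apparatus of the embodiment: frames are assumed to be lists of rows of gray-scale values indexed as frame[y][x], and the function names `ssd` and `find_corresponding_point`, as well as the small default search radius, are hypothetical.

```python
def ssd(frame_t, frame_t2, p, q, half=1):
    """Equation 1: sum of squared gray-level differences between the block
    centered on p = (x, y) in frame_t and the block centered on q in frame_t2.
    half=1 gives the nine-pixel (3 x 3) block of Embodiment 1."""
    total = 0
    for v in range(-half, half + 1):
        for u in range(-half, half + 1):
            d = frame_t[p[1] + v][p[0] + u] - frame_t2[q[1] + v][q[0] + u]
            total += d * d
    return total


def find_corresponding_point(frame_t, frame_t2, p, search=2, half=1):
    """Scan the candidate area (assumed here to be a square of radius
    `search` around p, clipped to the frame) and return the candidate
    with the minimum E1 together with that E1 value."""
    height, width = len(frame_t2), len(frame_t2[0])
    best, best_e1 = None, None
    for qy in range(max(half, p[1] - search), min(height - half, p[1] + search + 1)):
        for qx in range(max(half, p[0] - search), min(width - half, p[0] + search + 1)):
            e1 = ssd(frame_t, frame_t2, p, (qx, qy), half)
            if best_e1 is None or e1 < best_e1:
                best, best_e1 = (qx, qy), e1
    return best, best_e1
```

The candidate area of 100 x 60 pixels mentioned in Step 11 would correspond to a rectangular rather than square scan; the square radius is used here only to keep the sketch short.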

Fig. 8 is a conceptual model having the value of E1 in the vertical direction for every Pt'(i,j). In this model, Point
Q is determined as the corresponding point, since it shows a steep peak in non-similarity. In this way, corresponding
points of all representative points are determined.
Step 12 can also be modified as follows.

1. In the above, a squared difference of gray-scale level has been calculated, as a non-similarity, from a gray-scale
image. In a color image, however, the non-similarity may be the total sum of squared differences of gray-scale levels
in red, green and blue, that is, E1R + E1G + E1B. Alternatively, the density of other color spaces, such as an HVC
density, may be employed, or the total sum of residual differences may be employed in place of a squared
difference of gray-scale level.

2. In this step, nine pixels constitute one block, though it is preferable that one block be defined to include a relatively
large number of pixels. For example, with a screen having a high resolution, such as that of a personal computer,
a work station or the like, experiments have shown that a good result is obtained with a block including
around 16 x 16 pixels.

(Step 13) Determination of an Initial Position of a Corresponding Point

Up to Step 12, a tentative corresponding point has been determined, though it may not be positioned correctly.
Corresponding points relating to borders or edges of an object may have been determined with satisfactory accuracy,
though it should be understood that points relating to less characteristic image parts may have been determined with
considerable errors. Such a problem is likely to arise in a case where the value of E1 does not show a definite peak in
Fig. 8, or the like. Fig. 9 shows a relationship between a representative point and its corresponding point, the
corresponding point being determined up to Step 12. Apparently, although corresponding points relating to characteristic
parts, such as a house and a tree, and especially their outlines, are positioned with satisfactory accuracy, points relating
to the sky or the ground are positioned with considerable errors.
In Step 13 and subsequent Step 14, therefore, such inaccurately positioned corresponding points are adjusted so
as to be at a correct position. In Step 13, the concept of an initial position is introduced, so that the initial position of
each of the corresponding points is actually determined in this step. In subsequent Step 14, the positional accuracy is
improved through repeated calculations.

The initial position is determined in either of the ways stated below.

1. All corresponding points which have been determined up to Step 12 are equally processed in Step 13.

Positions where all corresponding points are now located are regarded as their initial positions for the
subsequent processing.

2. Corresponding Points are Processed Differently
As for corresponding points whose positions may have been determined with satisfactory accuracy (hereinafter
referred to as characteristic points), positions where they are now located are regarded as their initial positions.
On the other hand, as for other corresponding points (hereinafter referred to as non-characteristic points), their
initial positions are determined based on those of the characteristic points. The corresponding points mentioned
below can be candidates for a characteristic point, though the corresponding points of the following (1) to (3) are likely
to coincide. In this specification, representative points of the corresponding points that are characteristic points are also
referred to as characteristic points.

(1) A corresponding point having a definite peak in the value of E1 in Step 12. 

Generally, such corresponding points are quite likely to have been positioned with a high positional accuracy.

(2) A corresponding point located in an area including many orthogonal edge components.

Corresponding points included in areas around edges of buildings are quite likely to have been correctly
positioned.

(3) A corresponding point whose position varies steadily from Frame t to t' and further.

The steadiness may be understood as the consistency of a motion vector. Therefore, a corresponding point
moving in a consistent direction by a consistent distance as the frame proceeds from Frame t to t' should further
be selected as a characteristic point. Concretely speaking, a corresponding point to be selected should have a
motion vector whose variance is lower than a predetermined value, because such a corresponding point must have
been traced precisely among respective frames, and can thus be judged as having a correct corresponding
relationship with its representative point. However, when the camera has moved irregularly, the influence thereof
must be considered in the judgement.
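Criterion (3) can be illustrated with a short sketch. The specification states only that the variance of the motion vector should be lower than a predetermined value; the particular variance definition below (mean squared deviation of the frame-to-frame vectors from their mean), the function names, and the threshold value used in the example are assumptions made for illustration.

```python
def motion_variance(track):
    """track: positions (x, y) of one corresponding point over successive
    frames. Returns the mean squared deviation of the frame-to-frame
    motion vectors from their mean vector (an assumed definition)."""
    if len(track) < 2:
        raise ValueError("at least two positions are needed")
    vectors = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(track, track[1:])]
    n = len(vectors)
    mean_x = sum(v[0] for v in vectors) / n
    mean_y = sum(v[1] for v in vectors) / n
    return sum((vx - mean_x) ** 2 + (vy - mean_y) ** 2 for vx, vy in vectors) / n


def is_characteristic(track, threshold):
    """Select the point when its motion vector is consistent enough."""
    return motion_variance(track) < threshold
```

For example, a point tracked through (0, 0), (2, 1), (4, 2), (6, 3) has zero variance and would be selected for any positive threshold, whereas an erratically moving point would be rejected. Irregular camera motion, as noted above, would have to be compensated before applying such a test.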

When a characteristic point is determined, its position is used as an initial position, while the initial position of a
non-characteristic point is interpolated by using neighboring characteristic points. In other words, since the positional
accuracy of non-characteristic points determined up to Step 12 is low, their initial positions should be determined
geometrically based on the neighboring characteristic points with high positional accuracy. Of course, the method of Step
12 can be utilized in finding a characteristic point described in (3) above.

In addition to the above-mentioned methods based on the selection of a characteristic point, the initial position of a
corresponding point may be determined by a dynamic programming method.

(Step 14) Improvement Process for the Position of a Corresponding Point

An equation is introduced for evaluating positional acceptability of corresponding points, so as to improve the
relative positional acceptability through iterative calculations with the equation. That is, in addition to Equation 1 in Step 12,
another equation is introduced for evaluating acceptability of a relative positional relationship between corresponding
points. The evaluation results derived from both of the equations are combined to improve the positional accuracy.
Referring to Fig. 10, the principle of relative positional evaluation will be described. Fig. 10 shows corresponding
points. Taking Pt'(i,j) as a center, the following four corresponding points are located adjacent thereto:

Pt'(i-1, j), Pt'(i+1, j), Pt'(i, j-1), Pt'(i, j+1).

It is reasonably assumed that Pt'(i,j) is located around the center of gravity of these four points. This assumption is
based on the experience that, even when respective image parts move, their relative positional relationship is
substantially maintained. This experience can be mathematically explained as being equal to a situation where the
second-order derivative of Pt'(i,j), which is a function of i and j, is substantially zero.

Therefore, with the center of gravity of the four points being expressed as (St'(i,j)x, St'(i,j)y), Equation 2

$$E2 = \{P_{t'}(i,j)_x - S_{t'}(i,j)_x\}^2 + \{P_{t'}(i,j)_y - S_{t'}(i,j)_y\}^2$$ [Equation 2]

is obtained for evaluating relative positional acceptability. With consideration of Equation 2 only, a corresponding point
will be most favorably positioned with the minimum E2 value. In other words, relative positional acceptability of images
is evaluated using a function of the distance between neighboring image parts.
In this step, evaluation results derived from Equations 1 and 2 are combined with an appropriate coupling factor k.
Therefore, a final evaluation equation E can be expressed as Equation 3

$$E = E1/N + k\,E2$$ [Equation 3]

wherein N is the number of pixels included in one block, which has been determined for block matching. In other words,
for the improvement of the relative positional acceptability, E is first computed with respect to all of the corresponding
points. Then, after adding all E's into ΣE, the respective corresponding points are moved gradually so as to minimize
the value of ΣE. This computation is repeated until either the value of ΣE converges or the computation has been
repeated a predetermined number of times. Concretely speaking, any of the following methods is practiced while
moving the respective corresponding points.

(1) A Method Using an Euler-Lagrange Differential Equation 

When an Euler-Lagrange differential equation expresses the condition that ΣE takes an extremum (a relative
minimum in this case), a corresponding point is obtained by solving this Euler-Lagrange differential equation. This is a
known method. According to this method, the direction in which a corresponding point is to be moved for improvement
from its initial position is determined based on both the gradient in the respective blocks including a representative point
and the difference between corresponding blocks, so that the corresponding point is gradually moved in that direction
from its initial position until reaching a final solution.

(2) Fixed Searching Method 

In a corresponding point candidate area, a point is searched for where the value of E of the corresponding point to be
improved is minimized, and that point is then newly set as the corresponding point. The fixed searching method is
characterized in that the search is conducted for one corresponding point while the others are kept fixed. The
above-mentioned process is repeated with respect to all corresponding points.
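The fixed searching method can be sketched as below. The sketch assumes that E2 is the squared distance between a corresponding point and the center of gravity of its four grid neighbors, and that E = E1/N + kE2 as in Equation 3. The callback `e1`, the one-pixel search radius, the handling of border points (left fixed), and all function names are hypothetical choices for illustration.

```python
def e2(points, i, j, candidate):
    """Squared distance from a candidate position of Pt'(i, j) to the
    center of gravity of its four neighbors on the grid."""
    nbrs = [points[j][i - 1], points[j][i + 1], points[j - 1][i], points[j + 1][i]]
    sx = sum(p[0] for p in nbrs) / 4.0
    sy = sum(p[1] for p in nbrs) / 4.0
    return (candidate[0] - sx) ** 2 + (candidate[1] - sy) ** 2


def fixed_search(points, e1, k=10.0, n_pixels=9, radius=1, max_iter=20):
    """points[j][i] holds the (x, y) position of Pt'(i, j); modified in place.
    e1(i, j, (x, y)) must return the block-matching non-similarity E1.
    One point is moved at a time while all others are kept fixed."""
    rows, cols = len(points), len(points[0])
    for _ in range(max_iter):
        moved = False
        for j in range(1, rows - 1):
            for i in range(1, cols - 1):
                x0, y0 = points[j][i]
                best = (x0, y0)
                best_e = e1(i, j, best) / n_pixels + k * e2(points, i, j, best)
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        cand = (x0 + dx, y0 + dy)
                        e = e1(i, j, cand) / n_pixels + k * e2(points, i, j, cand)
                        if e < best_e:  # move only on strict improvement
                            best, best_e = cand, e
                if best != (x0, y0):
                    points[j][i] = best
                    moved = True
        if not moved:  # no point moved during a full pass
            break
    return points
```

Because each point is moved in whole-pixel steps, the result is accurate to one pixel, which matches the accuracy attributed to this method in the text.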

(3) Hybrid Method

According to method (1), it is possible to position a corresponding point with an accuracy theoretically in units
of less than a pixel, while according to method (2), with an accuracy in units of a pixel. Therefore, it is possible to
utilize both methods, that is, first applying method (2) to obtain a corresponding relationship with an accuracy in
units of a pixel, and then method (1) to enhance the accuracy.

Experiments have shown that method (2) provides a favorable solution in a shorter period of time than
method (1) used to obtain the same level of accuracy.

Fig. 11 shows the result of improvement processing according to this step which has been conducted with respect
to candidates for corresponding points shown in Fig. 9. Experiments have shown that a favorable result is obtained
in a color image with the coupling factor k at around 5 to 200. Figs. 9 and 11 show model results, though actual
experiments have proved that improvement close to the model results is realized.

This step is characterized in that 2-D motion information can be extracted from the movement of an object in any
direction. This is an advantage achieved through understanding the movement of an object by introducing the
concept of representative and corresponding points. This advantage makes the present invention applicable over a wider
range, compared to the prior art, in which a time difference has been determined through the detection of a horizontal
movement.

Step 14 can also be modified as follows:

1. In obtaining E2, a center of gravity of eight points may be determined, the eight points including four points
diagonally located from the center, that is, Pt'(i,j) in Fig. 10, as well as the four respectively located upward of, downward
of, to the left of and to the right of the center. Preferably, the optimum combination of the points is determined
experimentally, as it depends on the kind of 2-D image to be processed.

2. Evaluation by Equation 3 should begin with a corresponding point whose evaluation result of E2 is not
favorable, because a drastic improvement to such a corresponding point at an early stage is preferable, as it is
generally considered to have a large error.

3. For the improvement of the positional accuracy, geometrical information should be utilized. As for a plurality of
representative points forming an area with a geometrical feature, such as a straight line, in Frame t, positions of
their corresponding points should be corrected so as to also form the same geometrical feature. This correction is
made for the reasons that a part which seems to be a line in a viewfinder image is quite likely to form a line in the actual
3-D world as well, and a line in the 3-D world should form a line in Frame t' as well. Since the depth of an image
varies consistently along a line, and because such a linear variation can be visually recognized with ease, the
correction by the above-mentioned method will achieve a significant improvement. Without such an improvement, the
final image may include irregularity in depth along a line, thus possibly resulting in an unnatural 3-D display. As
alternative geometrical information, edges of an image area can be used.

4. Further, corresponding points are obtained with respect to other frames as well. In this stage, corresponding
points are only obtained in Frame t' with respect to Frame t, though it is possible to obtain corresponding points in
a third frame, or Frame t'', so as to obtain averaged movements of respective image parts. This method is not for
improving the relative positional accuracy of the corresponding points in Frame t', but rather for statistically
determining the movements of respective image parts, based on the respective positions of corresponding points, which
have been provided in many frames, and the respective times when the respective frames were shot.

5. When an insufficient number of characteristic points are established, the ongoing process is discontinued
because it is quite unlikely that an accurate corresponding relationship will be obtained.

[Stage 2] Calculation of 3-D Motion Information

In Stage 1, the 2-D movements of respective image parts on a screen have been identified. In Stage 2, 3-D
movements thereof are calculated based on the identified 2-D information. That is, since the 2-D movement in a viewfinder
image is a projection of the actual 3-D movement of an object onto a plane, the original 3-D movement of the object is
calculated based on the positional relationship between representative and corresponding points in a viewfinder image.
Movements of an object in the 3-D world can be generally described as a combination of translation and rotation
movements. In the following, a method for calculating a movement comprising translation movements only will be
described first, followed by an example of a generalized method.

1. Translation Movements Only

Fig. 12 shows a corresponding relationship between the movement of Point P on a screen and its actual movement
in a 3-D space. In Fig. 12, the 2-D coordinates are expressed with capital letters, while the 3-D coordinates are
expressed with small letters, in which the x and y axes are provided on the surface of the screen, while the z axis is in the
depth direction. The distance between the viewpoint and the screen is set as 1.
As shown in Fig. 12, P(X,Y) moves to P'(X',Y') on the 2-D screen, while S(x,y,z) simultaneously moves to S'(x',y',z')
in the 3-D space.

When the following equation holds,

$$(x', y', z') = (x, y, z) + (a, b, c)$$

since the screen is placed at a distance of 1 from the viewer, X, Y, X' and Y' can be expressed as follows:

$$X = x/z, \quad Y = y/z$$
$$X' = x'/z', \quad Y' = y'/z'$$

By solving the above, the following is obtained.

$$X' = (Xz + a)/(z + c)$$

$$Y' = (Yz + b)/(z + c)$$

Therefore, with z eliminated, Equation 4 is obtained.

$$(a - X'c)(Y' - Y) = (b - Y'c)(X' - X)$$ [Equation 4]
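Equation 4 can be checked numerically with a small worked example. The sketch below assumes a screen at distance 1 from the viewpoint, as in Fig. 12; the function names and the sample point and translation are chosen only for illustration.

```python
def project(x, y, z):
    """Perspective projection onto a screen at distance 1: X = x/z, Y = y/z."""
    return x / z, y / z


def equation4_residual(p, p2, translation):
    """Left side minus right side of Equation 4 for a point pair P -> P'."""
    (X, Y), (X2, Y2) = p, p2
    a, b, c = translation
    return (a - X2 * c) * (Y2 - Y) - (b - Y2 * c) * (X2 - X)


# A 3-D point S and a pure translation (a, b, c), both chosen arbitrarily.
s = (1.0, 2.0, 4.0)
t = (0.5, -0.25, 1.0)
s2 = (s[0] + t[0], s[1] + t[1], s[2] + t[2])
residual = equation4_residual(project(*s), project(*s2), t)
```

Here the residual is zero up to rounding, and it stays zero when (a, b, c) is multiplied by any common factor, which is the scale ambiguity of the translation.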

Since Equation 4 is expressed in terms of movements on the screen, it is possible to obtain the unknown values of (a),
(b), and (c) according to the information obtained in Stage 1. However, in an actual situation, an object that is k times
larger, moving at a speed k times higher at a place k times further away, produces the same movement on the screen.
Therefore, the value of k (a scale factor) cannot be determined, and only the ratio of the values of (a), (b) and (c) to one
another can be obtained. Mathematically speaking, even if three pairs of (X,Y) and (X',Y') are given, since the rank of
the coefficient matrix of this simultaneous equation is as low as two, (a), (b), and (c) cannot be determined as real
values but only as relative values. Therefore, in this stage, the value of (c) is normalized to one, so as to express the
values of (a) and (b) as ratios against (c), because a ratio is sufficiently usable in the subsequent processing.

An alternative solution with respect to translation movements is as follows. An error (e) is defined from Equation 4,
as Equation 5.

$$e = \{(a - X'c)(Y' - Y) - (b - Y'c)(X' - X)\}^2$$ [Equation 5]
$$e = \{(Y' - Y)a - (X' - X)b - (XY' - X'Y)c\}^2$$

Then, the total sum Σe of all (e)'s regarding all corresponding relationships between representative and corresponding
points is calculated, so that the respective values of (a), (b) and (c) are obtained from Equations 6 to 8 so as to minimize
the value of Σe.

$$d(\Sigma e)/da = 0$$ [Equation 6]
$$d(\Sigma e)/db = 0$$ [Equation 7]
$$d(\Sigma e)/dc = 0$$ [Equation 8]
More concretely speaking, Equations 6 to 8 are respectively developed into Equations 9 to 11.

$$a\sum(Y'-Y)^2 - b\sum(X'-X)(Y'-Y) - c\sum(Y'-Y)(XY'-X'Y) = 0$$ [Equation 9]

$$-a\sum(X'-X)(Y'-Y) + b\sum(X'-X)^2 + c\sum(X'-X)(XY'-X'Y) = 0$$ [Equation 10]

$$-a\sum(Y'-Y)(XY'-X'Y) + b\sum(X'-X)(XY'-X'Y) + c\sum(XY'-X'Y)^2 = 0$$ [Equation 11]
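A least-squares solution along these lines can be sketched as follows. The sketch assumes that (c) is normalized to one, as stated earlier for the translation-only case, so that Equations 9 and 10 reduce to two linear equations in (a) and (b); without such a normalization the homogeneous system would also admit the trivial solution a = b = c = 0. The function name and input layout are hypothetical.

```python
def estimate_translation(pairs):
    """pairs: list of ((X, Y), (X', Y')) screen positions of representative
    and corresponding points. Returns (a, b) with c normalized to 1,
    obtained by solving Equations 9 and 10 with c = 1."""
    saa = sab = sbb = sac = sbc = 0.0
    for (X, Y), (X2, Y2) in pairs:
        A = Y2 - Y            # coefficient of a in Equation 5
        B = X2 - X            # coefficient of b (with a minus sign)
        C = X * Y2 - X2 * Y   # coefficient of c (with a minus sign)
        saa += A * A
        sab += A * B
        sbb += B * B
        sac += A * C
        sbc += B * C
    # With c = 1:   a*saa - b*sab =  sac   (Equation 9)
    #              -a*sab + b*sbb = -sbc   (Equation 10)
    det = saa * sbb - sab * sab
    if abs(det) < 1e-15:
        raise ValueError("motion vectors are degenerate (all parallel or zero)")
    a = (sac * sbb - sab * sbc) / det
    b = (sab * sac - saa * sbc) / det
    return a, b
```

Given noise-free projections of points translated by (1.0, -0.5, 2.0), the function returns the ratios a/c = 0.5 and b/c = -0.25 up to rounding.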

2. Movements Including Rotations 

Movements including both translation and rotation can be expressed by means of three displacements in the x, y and
z axial directions and three rotation angles, such as α, β, and γ, each having a respective one of the x, y and z axes as
an axis of rotation. Rotation angles can be expressed by means of Eulerian angles or a roll-pitch method.

The values of the above six variables are the next to be obtained. However, as explained above, since a scale factor
cannot be determined, only the ratio of the variables to one another is obtained, assuming one of the variables as one.
It is theoretically possible to specify a movement when given five pairs of representative and corresponding points.

However, it is to be noted that, depending on the selection of the pairs, the content of movements may not be
specified by means of a solution based on a linear transformation in some cases. It is known that the selection of eight pairs
can prevent such cases, grounds for which can be found in references such as "On the Linear Algorithm for Monocular
Stereo-Scopy of Moving Objects" by Deguchi and Akiba, Transactions of the Society of Instrument and Control Engineers,
vol. 26, No. 6, 714/720 (1990).

[Stage 3] Acquisition of Depth Information 

The relative extent of the 3-D movements of the respective image parts has been identified in Stage 2. In Stage 3,
depth information of the respective image parts is obtained based on the relative extent. In the following description, it
is assumed that an object stays still, while the camera shooting the object moves instead. For this stage, since only the
relative movement between an object and a camera matters, this assumption can be made.

The movement of a certain part in a viewfinder image is expressed by means of a rotation matrix R and a
translation vector (a,b,c) as follows:

$$(x', y', z') = R(x, y, z) + (a, b, c)$$

The inverse transformation of this equation, which is expressed as the following Equation 12, is considered to be the
movement of the camera.

$$(x, y, z) = R^{-1}\{(x', y', z') - (a, b, c)\}$$ [Equation 12]
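The relation between the forward movement and Equation 12 can be illustrated with a short sketch. It relies on the standard fact that the inverse of a rotation matrix is its transpose; the restriction to a rotation about the z axis and all function names are assumptions made to keep the example small.

```python
import math


def rotation_z(gamma):
    """Rotation matrix for an angle gamma (radians) about the z axis."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply(R, v):
    """Matrix-vector product R v for a 3 x 3 matrix and a 3-vector."""
    return tuple(sum(R[r][k] * v[k] for k in range(3)) for r in range(3))


def transpose(R):
    return [[R[c][r] for c in range(3)] for r in range(3)]


def move(R, t, s):
    """Forward movement: (x', y', z') = R (x, y, z) + (a, b, c)."""
    rs = apply(R, s)
    return tuple(rs[k] + t[k] for k in range(3))


def inverse_move(R, t, s2):
    """Equation 12: (x, y, z) = R^-1 {(x', y', z') - (a, b, c)},
    using R^-1 = R^T for a rotation matrix."""
    diff = tuple(s2[k] - t[k] for k in range(3))
    return apply(transpose(R), diff)
```

Applying `inverse_move` to the output of `move` with the same R and translation recovers the original coordinates, which is the sense in which Equation 12 describes the camera movement.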

Referring to Fig. 13, the principle for obtaining 3-D coordinates of Point P, based on 3-D movements of a camera
and 2-D movements of Point P on a screen, will be explained. This principle is generally known as triangulation,
in which, when viewing the direction of Point P from two separate points, Point P (Point S in Fig. 13) actually exists at
the crossing point of the lines of sight from the two points.

In Fig. 13, it is assumed that a camera is moved as indicated by the arrow from Time t to t' according to Equation
12. Point S is projected at Point Pt in Frame t and at Point Pt' in Frame t', Point S being the crossing point of Lines Lt
and Lt'.

Since the angles θt and θt', which are formed by the direction in which the camera faces and Lines Lt and Lt',
respectively, are known, and the direction in which the camera moves and its moving distance have been identified, it is
possible to obtain 3-D coordinates of Point S. Based on the 3-D coordinates of Point S, the respective image components
can be provided with their depth information.

It is to be noted that, as described above, due to the normalization of (c) as 1, the obtained 3-D coordinates of Point
S have been expanded or compressed by a uniform ratio. However, since they are uniformly expanded or compressed
as a whole, the depth information retains a correct relative positional relationship among respective image parts.

In the above-mentioned processing at this stage, it is necessary to consider errors which have been caused up to
the previous stage. In other words, due to such errors, Lines Lt and Lt' often do not cross each other as a result of
calculation. To cope with such a problem, a point is provided at the middle of a line connecting the points on Lines Lt
and Lt' where the lines are closest to each other, so that the z coordinate of such a point is approximately designated as
the depth of Point S. This process will next be described using expressions.

When the direction vectors of Lines Lt and Lt' are respectively expressed as (u,v,w) and (u',v',w'), both Lines Lt and
Lt' can be expressed as the following Equation 13, using parameters α and β (real numbers).

$$Lt:\ (x, y, z) + \alpha(u, v, w)$$ [Equation 13]

$$Lt':\ (x', y', z') + \beta(u', v', w')$$

Therefore, when an error (e) is expressed as the following,

$$e = \{(x+\alpha u)-(x'+\beta u')\}^2 + \{(y+\alpha v)-(y'+\beta v')\}^2 + \{(z+\alpha w)-(z'+\beta w')\}^2$$

the values of α and β which minimize the value of (e) are obtained using the expressions de/dα = 0 and de/dβ = 0. In
other words, by solving the equations

$$(u^2+v^2+w^2)\alpha - (uu'+vv'+ww')\beta + (x-x')u + (y-y')v + (z-z')w = 0$$

$$(u'^2+v'^2+w'^2)\beta - (uu'+vv'+ww')\alpha - (x-x')u' - (y-y')v' - (z-z')w' = 0$$

the values of α and β are determined, so that the depth of Point S is finally expressed as the following.

$$\{(z+\alpha w)+(z'+\beta w')\}/2$$

Especially in the case that the error (e) is zero, the (z) coordinate of the midpoint coincides with that of the crossing
point of Lines Lt and Lt'.
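The midpoint construction can be sketched as follows. The sketch parameterizes Line Lt as p + α d and Line Lt' as p' + β d', minimizes the squared distance between the two points, and returns the z coordinate of the midpoint. The function name, the argument layout, and the treatment of parallel lines (an exception) are assumptions made for illustration.

```python
def midpoint_depth(p, d, p2, d2):
    """p, d: a point on Line Lt and its direction (u, v, w);
    p2, d2: a point on Line Lt' and its direction (u', v', w').
    Returns (depth, alpha, beta), where depth is the z coordinate of the
    midpoint between the closest points of the two lines."""
    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    r = (p[0] - p2[0], p[1] - p2[1], p[2] - p2[2])
    dd, dd2, d2d2 = dot(d, d), dot(d, d2), dot(d2, d2)
    rd, rd2 = dot(r, d), dot(r, d2)
    # Stationarity conditions of the squared distance:
    #    dd * alpha - dd2  * beta = -rd
    #  -dd2 * alpha + d2d2 * beta =  rd2
    det = dd * d2d2 - dd2 * dd2
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel; closest points are not unique")
    alpha = (-rd * d2d2 + dd2 * rd2) / det
    beta = (dd * rd2 - dd2 * rd) / det
    depth = ((p[2] + alpha * d[2]) + (p2[2] + beta * d2[2])) / 2.0
    return depth, alpha, beta
```

When the two lines actually intersect, the two closest points coincide and the returned depth equals the z coordinate of the crossing point.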

As an alternative method, Lines Lt and Lt' are both perspectively projected onto the screen of Frame t so as to
obtain the (z) coordinate of the closest point of Lines Lt and Lt'. In this approach, Line Lt is projected as one point on
the screen, while Line Lt' is projected as one line in general. With Line Lt' expressed as in Equation 13, the (x) and (y)
coordinates of the points on the projected Line Lt' on the screen are expressed as Equations 14 and 15 by dividing the
(x) and (y) coordinates of the points on Line Lt' in the 3-D space by their (z) coordinates, respectively.

$$x = f(x'+\beta u')/(z'+\beta w')$$ [Equation 14]

$$y = f(y'+\beta v')/(z'+\beta w')$$ [Equation 15]

wherein (f) is the actual distance from the viewpoint to the screen of Frame t, which can be set as one. By eliminating β
from Equations 14 and 15, Line Lt' after being projected (hereinafter referred to as Li) can be specified as follows.

$$kx + my + fn = 0$$

wherein k = v'z' - w'y', m = w'x' - u'z', and n = u'y' - v'x'. The closest point to be detected is the foot of a perpendicular
from the representative point Pt to Line Li (hereinafter referred to as Point D), that is, the point where a line drawn from
the representative point Pt meets Line Li so as to form a right angle, and the coordinates of Point D are expressed as
the following Equation 16.

55 x=(m^X-kn-kmY)/(k Vm^) [Equation 16] 

y =(k ^ Y- mn'kmX)/{k ^+m ^) 
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Assuming thai the original point on Une Lf in the 3-D space, which corresponds lo Point is designated as Point E 
(x^y",z"), Point E can be detected by substituting Equation 16 Into Equation 14 to obtain p, and further substituting the 
(Stained value of p into Equation 1 3. Since p is expressed as 

5 p = (xzMx')/(!u'-xW), 

by substituting this expression into Equation 13, the (2) coordinate of Point E. that is 2", is determined as the following: 

2"=:2*-fW'(X2-fx')/{fu'«XW*) 

This can be used as a depth value of Point S. 
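
The projection-based alternative can be sketched in the same way. The function name `depth_by_projection` is an assumption for illustration. The sketch keeps (f) general in the foot-of-perpendicular step, and, as a further assumption not found in the text, it falls back to Equation 15 to obtain $\beta$ when the denominator derived from Equation 14 vanishes.

```python
def depth_by_projection(pt, p2, d2, f=1.0):
    """Depth z'' of Point E via projection of Line Lt' onto the screen.

    pt : (X, Y)        representative point Pt on the screen of Frame t
    p2 : (x', y', z')  a point on Line Lt'
    d2 : (u', v', w')  direction vector of Line Lt'
    (Hypothetical helper; names are not taken from the specification.)
    """
    X, Y = pt
    xp, yp, zp = p2
    up, vp, wp = d2

    # Coefficients of the projected line Li: k*x + m*y + f*n = 0
    k = vp * zp - wp * yp
    m = wp * xp - up * zp
    n = up * yp - vp * xp
    denom = k * k + m * m
    if denom == 0.0:
        raise ValueError("Line Lt' projects to a single point on the screen")

    # Point D: foot of the perpendicular from Pt to Li (Equation 16 when f = 1).
    x = (m * m * X - k * m * Y - k * f * n) / denom
    y = (k * k * Y - k * m * X - m * f * n) / denom

    # beta from Equation 14, or from Equation 15 when that is better conditioned.
    den_x = f * up - x * wp
    den_y = f * vp - y * wp
    if abs(den_x) >= abs(den_y):
        if den_x == 0.0:
            raise ValueError("beta cannot be determined for this configuration")
        beta = (x * zp - f * xp) / den_x
    else:
        beta = (y * zp - f * yp) / den_y

    # z coordinate of Point E on Line Lt' (Equation 13).
    return zp + beta * wp
```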

When the depth value is negative due to errors caused in image processing, the computed value is not reliable, because a negative value means that Point S exists behind the camera. Therefore, the (z) coordinate of Point S needs to be obtained in some other way, such as by using nearby representative points with a positive value.
Irrespective of which method is utilized, the computed depths of the respective image parts should be given to the respective representative points as an actual numerical value. Fig. 14 shows representative points, each of which is given an actual numerical value. For example, the depths of Pt(2,3) and Pt(4,3) are respectively 100 and 200, the latter actually being located twice as far away as the former.

[Stage 4] Creation of Image

A parallax is determined according to the depth information, which has been acquired in Stage 3, so as to create right and left images. In this stage, a farther image part is to be provided with a smaller parallax.

In Fig. 15, which is a top view of the whole system including an object and a camera, parallaxes are given according to depth information. When Pt(2,3) and Pt(4,3) of Fig. 14 are provided on a viewfinder image shot by the camera under the situation shown in Fig. 15, their actual positions are at St(2,3) and St(4,3), respectively, the latter being located twice as far away from the camera as the former.

R and L screens and R and L viewpoints are respectively placed as shown in Fig. 15, the R and L viewpoints respectively corresponding to the right and left eyes of a viewer. Then, St(2,3) and St(4,3) are projected on each of the R and L screens by viewing them from the respective R and L viewpoints. This projection is carried out with respect to all representative points until a final image is formed on each of the R and L screens. The final images can be used as right and left images, respectively. By displaying such images on a display of a lenticular lens type or the like, which is disclosed in JP Application Laid-Open No. Hei 3-65943, it is possible to obtain a good stereoscopic image.
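
The relationship between depth and parallax in this projection can be illustrated with a small sketch. The geometry used here is an assumption made only for illustration, not the exact arrangement of Fig. 15: in a top view, the R and L viewpoints are placed at x = +eye_sep/2 and x = -eye_sep/2, both looking along the depth axis, with a screen at distance screen_dist in front of each viewpoint. The function and parameter names are hypothetical.

```python
def project_to_screens(x, depth, eye_sep=1.0, screen_dist=1.0):
    """Project a point (x, depth), seen from above, onto the L and R screens.

    Returns (x_left, x_right), each measured from the axis of its own viewpoint.
    Assumed geometry: viewpoints at x = -eye_sep/2 (L) and x = +eye_sep/2 (R).
    """
    if depth <= 0:
        raise ValueError("depth must be positive (point in front of the viewpoints)")
    x_left = screen_dist * (x + eye_sep / 2.0) / depth
    x_right = screen_dist * (x - eye_sep / 2.0) / depth
    return x_left, x_right


def parallax(x, depth, eye_sep=1.0, screen_dist=1.0):
    """Horizontal offset between the L and R projections of the same point."""
    x_left, x_right = project_to_screens(x, depth, eye_sep, screen_dist)
    return x_left - x_right
```

Under these assumptions the parallax equals eye_sep * screen_dist / depth, so a point at depth 200 receives half the parallax of a point at depth 100, in line with the rule that a farther image part is given a smaller parallax.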

In this embodiment, the stereoscopic image may be generated only for a desired part which has been separated from the image. Take as an example a scene where a person is located 5 meters from the camera, with mountains in the background. Image processing on a condition of "within 10 meters in depth" would make it possible to separate the area including only the person from the whole image area, so that right and left images can be created with respect to only the area containing the person, while leaving the rest blank or pasting the area with the person on pre-prepared different images. This stage differs from the stages up to Stage 3 in the number of viewfinder image frames to be used. Up to Stage 3, at least two frames are used in extracting the required information, whereas in Stage 4, one frame is sufficient for the creation of right and left images. Fig. 16 shows right and left images, which have been created using Frame t as reference, in which the image parts of a tree, a house and a person are located at a smaller distance from the viewer in this order. The image part of the person, closest to the viewer, exhibits the following features:

1. having the largest displacement to the left in the right image.
2. having the largest displacement to the right in the left image.

It is understood that the above (1) corresponds to a situation where the viewer sees the person from a point slightly rightward of the original viewpoint, and the above (2) from a point slightly leftward of the original viewpoint. As a result of these features, the person is perceived as being at a smaller distance from, that is, closer to, the viewpoint. In Fig. 16, the displacements of the respective image parts are indicated by means of the movements of crossing points in the grid, in which the person, the house and the tree present a smaller displacement (a parallax) in this order.

For image creation based on Frame t, the respective divided parts of a viewfinder image in Fig. 15 are to be transformed. In this case, it is necessary to select either a linear or a non-linear transformation as follows.

1. Non-Linear Transformation

As is shown in Fig. 16, some of the divided parts are transformed into trapezoids. A widely-used linear transformation, such as an affine transformation, however, cannot be applied to such a transformation. Therefore, in order to transform a part with four vertexes into a trapezoid, a non-linear transformation, such as a perspective transformation, is applied.

2. Linear Transformation

In the transformation into a trapezoid, provided that a part with four vertexes is first divided into two parts each having three vertexes, a linear transformation can be applied with respect to each such part.
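
The linear alternative can be sketched as follows: an affine map is fully determined by three vertex correspondences, so each of the two triangles obtained from a four-vertex part can be transformed separately. The function names and the coefficient layout are assumptions introduced for this illustration.

```python
def affine_from_triangle(src, dst):
    """Affine coefficients (a, b, c, d, e, f) mapping triangle src onto dst.

    The map is x' = a*x + b*y + c, y' = d*x + e*y + f.
    src, dst : three (x, y) vertices each. (Hypothetical helper.)
    """
    (x0, y0), (x1, y1), (x2, y2) = src
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if det == 0:
        raise ValueError("degenerate source triangle")
    coeffs = []
    for axis in (0, 1):                      # solve for x' first, then y'
        q0, q1, q2 = dst[0][axis], dst[1][axis], dst[2][axis]
        a = ((q1 - q0) * (y2 - y0) - (q2 - q0) * (y1 - y0)) / det
        b = ((q2 - q0) * (x1 - x0) - (q1 - q0) * (x2 - x0)) / det
        c = q0 - a * x0 - b * y0
        coeffs.extend((a, b, c))
    return tuple(coeffs)


def apply_affine(coeffs, point):
    """Apply the affine map returned by affine_from_triangle to one point."""
    a, b, c, d, e, f = coeffs
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Usage: split a unit square into two triangles and map it onto a trapezoid.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
trapezoid = [(0, 0), (1, 0), (0.8, 1), (0.2, 1)]
lower = affine_from_triangle(square[:3], trapezoid[:3])
upper = affine_from_triangle([square[0], square[2], square[3]],
                             [trapezoid[0], trapezoid[2], trapezoid[3]])
```

The two triangles share the diagonal from the first to the third vertex, and both maps agree along it, so the transformed part stays continuous even though it is piecewise linear.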

As a result of a horizontal displacement of the respective image parts through the above-mentioned transformation, the peripheral edge of the image may become uneven. In Fig. 16, the bottom parts of the right and left images are displaced inwardly with respect to each other, and accordingly the peripheral edges of the displaced parts become crooked. Therefore, by adding pixels to such a recess, the shape of the image is corrected back into its original shape (a rectangle in this example).

The depth of an image part that falls on the added pixels is determined by referring to the depth of those close to the pixels, or in other ways. The images on the added pixels can be seen by only one of the eyes, which is a natural phenomenon that also arises in an area close to a window frame when people look outside through a window. It is to be noted that this correction can also be made by deleting redundant pixels which project outward of the edge. Alternatively, the peripheral part of the image is uniformly cut off by a certain width. With this correction, irrespective of the selection among the above methods, it is possible to maintain a natural display.

In this stage, a parallax is determined according to a depth, though the parallax is preferably further adjusted for 
the following reasons. 

1. Eye Fatigue



In the above example, it is not desirable for even a person closest to the viewer to be given an extremely small depth, because an image perceived to be excessively frontward from the screen would cause fatigue to the viewer's eyes. According to a report in "Nikkei Electronics" (April 4, 1988, p. 211), it is most desirable for respective image parts to be given a depth in a range between 0.2 m and 2 m, when a display is positioned 50 cm from the viewer.

2. Personal Preference

Some people prefer a close image to be displayed much closer and a distant image much farther, while others prefer the opposite.

3. Processing Capacity 

If all image areas constituting a far background, such as a mountain, are displayed as if having the same distance, the volume of data to be processed can be reduced.

Because of the foregoing reasons, in this stage, the following functions for transforming a depth or a parallax are applied as requested.

1. Depth Transformation Function



A depth is directly subject to either a linear or a non-linear transformation. That is, the object of the transformation is a depth, and a parallax is changed as a result. For example, as for a viewfinder image comprising image parts with depths in the range of 1a to 10a ((a) being an arbitrary value), the depths of the respective image parts can be uniformly multiplied by ten, so that all depths fall in the range of 10a to 100a. This depth transformation function is advantageous for a viewfinder image with an excessively small depth as a whole.

Alternatively, when the depth is in the range of 0 to 100a, the depth may be compressed, for example, to a range such as 25a to 75a with 50a as the origin of transformation. As a further alternative, all images having a depth equal to or less than 20a, or equal to or more than 1000a, may be transformed so as to have a uniform depth of 20a or 1000a, respectively. In this case, however, as a result of the uniform transformation, areas at the upper and lower limit values, that is 1000a and 20a, become discontinuous, and thus form an unnatural display in some viewfinder images. In order to solve this problem, a non-linear transformation is applied such that images smoothly converge at the upper and lower limit values. In this example, the following transformation should be made:

$$z \to \alpha / \{1 + \exp(-(z - 0.5\alpha)/\alpha T)\} + z_0$$

wherein (z) is the original depth, $z_0 = 20a$, $\alpha = 1000a - 20a = 980a$, and $T = 4$.
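
The linear depth transformations mentioned in this section (uniform scaling, compression about an origin, and limiting to a fixed range) can be sketched as below. The function names are assumptions introduced for illustration, and the unit (a) is taken as 1 in the usage values; the smooth non-linear variant is not covered by this sketch.

```python
def scale_depth(z, factor):
    """Uniformly multiply a depth, e.g. factor=10 maps 1a..10a to 10a..100a."""
    return z * factor


def compress_depth(z, origin, ratio):
    """Compress (ratio < 1) or expand (ratio > 1) depths about a fixed origin."""
    return origin + (z - origin) * ratio


def limit_depth(z, lower, upper):
    """Replace depths outside [lower, upper] with the nearest limit value."""
    return min(max(z, lower), upper)


# Usage with a = 1:
depths = [1, 5, 10]
scaled = [scale_depth(z, 10) for z in depths]                       # 10a .. 100a
compressed = [compress_depth(z, 50, 0.5) for z in (0, 50, 100)]     # 25a .. 75a
limited = [limit_depth(z, 20, 1000) for z in (5, 300, 2000)]
```

Because `limit_depth` is flat outside the limits and has slope one inside them, neighbouring image parts near a limit value change abruptly, which is the discontinuity that the smooth non-linear transformation is intended to avoid.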
2. Parallax Transformation Function

A parallax is subject to a linear or a non-linear transformation. That is, after a parallax is calculated based on a depth and transformed, the depth is re-calculated based on the transformed parallax.

Fig. 17 shows a non-linear transformation of a parallax, in which Point S, the object of the transformation, is provided on a central line L, and Point B is at the foot of a perpendicular from Viewpoint A to Line L. The depth of Point S is expressed by a segment SB, and the parallax $\theta$ (strictly speaking, a half of the parallax) is set as shown in Fig. 17.

Take as an example a case where the parallax is reduced to a half. That is, Point S is to be transformed to a point which satisfies the following Equation 17, that is, Point S'.

$$\theta' = \theta/2 \qquad \text{[Equation 17]}$$

The depth of Point S' is expressed with a segment S'B. A series of processes in connection with the transformation will be mathematically explained. First, $\theta$ is determined using the depth SB, according to the relationship of

$$\theta = \operatorname{atan}(SB)$$

S'B is next determined according to the relationship of

$$S'B = \tan\theta'$$

so that S'B will be used as depth information after the transformation. Since a far point is transformed to be much farther and a close point is much closer than through a simple linear transformation, the sense of depth is more effectively adjusted through this transformation. Equation 17 expresses a simple linear scaling, although a variety of non-linear transformations, such as is described in 1 (Non-Linear Transformation), can also be applied to the transformation of $\theta \to \theta'$.
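
The sequence depth, then parallax, then transformed parallax, then depth can be written out as a short sketch. The function name and the `ratio` parameter are assumptions introduced here; `ratio=0.5` corresponds to Equation 17, and the relationship θ = atan(SB) is used exactly as stated, which implicitly takes the distance AB as the unit length.

```python
import math


def transform_depth_by_parallax(sb, ratio=0.5):
    """Return the depth S'B obtained by scaling the (half) parallax of depth SB.

    sb    : original depth SB, in units of the distance AB (assumed to be 1)
    ratio : factor applied to theta; 0.5 reproduces Equation 17
    """
    theta = math.atan(sb)           # theta = atan(SB)
    theta_new = theta * ratio       # theta' = theta / 2 when ratio = 0.5
    return math.tan(theta_new)      # S'B = tan(theta')
```

For example, a depth SB of 1 gives θ = π/4, so halving the parallax yields S'B = tan(π/8), which shows that the resulting change in depth is not a simple proportional scaling.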

According to Embodiment 1 of the present invention, an image is newly created based on depth information, instead of a combination of existing viewfinder images. Since this creation does not require a horizontal movement, which has been mandatory in a conventional time difference method, the present invention is applicable over a wider range. In addition, since a method for detecting a corresponding point with respect to a representative point is disclosed in the present invention, it is possible to automate the extraction of depth information and the creation of an image with ease and efficiency.

Embodiment 2.

The optimum apparatus for practicing Embodiment 1 will be described.

Fig. 18 shows a hardware structure for practicing Embodiment 1.
In Fig. 18, a viewfinder image to be processed is supplied via an image input circuit 20, whereupon it is converted into a digital signal. The digital viewfinder image is stored by a frame memory control circuit 22 in a frame memory 24. Subsequent to the memory 24, a corresponding point detection circuit 26 is provided for reading out a plurality of viewfinder image frames for detection of corresponding points. In the detection circuit 26, the process at Stage 1 of Embodiment 1 is practiced by means of hardware, in which an MPEG encoder or the like is used for block matching.

The coordinates of corresponding points, which have been detected in the circuit 26, are stored in a corresponding point coordinate memory 28, so as to be arbitrarily read out by a movement detection circuit 30. In the movement detection circuit 30, the processes at Stages 2 and 3 of Embodiment 1 are practiced, in which a 3-D relative position of the object is calculated based on its translation and rotation movements.

The calculated information about the 3-D relative position is supplied to an image creation circuit 32, where the original digital viewfinder image is retrieved from the frame memory 24 so as to create right and left images, respectively, by giving an appropriate parallax between them. Prior to the image creation circuit 32, an instruction input section 34 is provided for receiving several instructions from outside.
The right and left images, which have been created in the image creation circuit 32, are converted into an analog signal by an image output circuit 36, to be supplied to an unillustrated display.
The operation of the apparatus will next be described.

A camera shoots an object so as to capture its viewfinder image, or video equipment plays a viewfinder image. Such a viewfinder image is supplied via the image input circuit 20, so as to be stored in the frame memory 24. For a normal 2-D display, the supplied viewfinder images will be displayed intact, or alternatively the viewfinder images stored in the frame memory 24 are sequentially read out therefrom for display. For a 3-D display, a plurality of frames of a viewfinder image are read out from the frame memory 24, so that depth information of the object will be obtained from the read-out frames with the corresponding point detection circuit 26 and the movement detection circuit 30. Subsequently, the image creation circuit 32 creates right and left images according to the depth information.
The instruction input section 34 can be structured as follows so as to achieve the following functions.

1. Structured as a Control Knob

The sense of depth on the created image can be adjusted so as to satisfy the personal preferences of a user by varying the sense of depth through scaling depth with a control knob. The rotation of the knob may be adjusted in advance such that the minimized sense of depth will provide a 2-D display.

2. Structured as a Pointing Device

(1) The sense of depth is adjustable in units of image parts. For example, when the person in Fig. 16 is desired to be displayed much closer, a pointing device, such as a mouse, is used to point to the person, and is then clicked. As a result, the image creation circuit 32 transforms the depth information of the person for use in an enhanced display by giving a wider parallax. The effect of this adjustment will become more significant if the display area of the selected item is also changed together with the change of the sense of depth. Concretely speaking, with the halved sense of depth, the display area of the selected item will be expanded four times.

(2) A viewfinder image seen from a different point can be created. Since the depth information is available, by designating a shooting point (a viewpoint) of the object through clicking with a mouse, it is possible to compute, through a calculation of translation and rotation movements of the respective image parts, the movements to be caused accompanying the change of the viewpoint. Therefore, a viewfinder image to be seen from a different point can be created. In Fig. 16, for example, a viewfinder image to be seen after the viewpoint is changed in height, or by advancing or withdrawing the camera, can be re-created. Further, since the depth information of the re-created viewfinder image can be computed through the calculation, its 3-D display can be maintained in a good condition with the parallax changed according to the newly computed depth information. A viewfinder image seen from a different point will be further described later in Embodiment 5.

In the following, the results of experiments, in which the apparatus of the present invention is mounted in a work station, will be described with reference to the drawings.

Figs. 19 to 26 show image creation procedures with the apparatus according to the present invention. Each of the drawings is a B/W picture on a display which comprises an area including about 640 x 480 pixels.

Figs. 19 and 20 are viewfinder images in Frames t and t', respectively, exhibiting some movements between them due to a positional difference of the camera. Fig. 21 shows the same viewfinder image as Fig. 19, with a grid overlaid and representative points provided. Fig. 22 shows the same viewfinder image as Fig. 20, with corresponding points at their initial positions, in which the initial position is set at the temporary best point. The temporary best point is obtained through block matching, which has been conducted, beginning with a characteristic point, with respect to an area of 16 x 16 pixels with a representative point at its center.

Fig. 23 shows improved positions of corresponding points, presenting a significant improvement from Fig. 22, as a result of Equation 3 in Embodiment 1 for considering a positional relationship between corresponding points.

Fig. 24 expresses depth information at a gray level, where a lighter level represents a smaller depth. It can be seen from the drawing that depth information has been acquired with considerable accuracy.

Figs. 25 and 26 are right and left images, respectively, which are created based on depth information. A closer object, a can in this example, is shown to have a wider parallax, and is thus given a larger horizontal displacement.

As described above, with the present apparatus, it is possible to automatically practice the method of Embodiment 1 of the present invention. Further, the application of hardware for block matching contributes considerably to the improvement of processing speed, compared to the execution time required with the application of software.

The present apparatus can be effectively embodied in a product by attaching an add-on card having the structure shown in Fig. 18 to a personal computer or a work station, or by pre-installing a circuit having the structure shown in Fig. 18 in a television receiver, a video player or the like. Further, by combining the present apparatus with a camera, it is also possible to shoot an object separately from its surroundings in order to capture many viewfinder images seen from different points, so as to produce a catalogue containing 3-D pictures of the object. With this way of shooting, depth measurement by means of a laser, infrared rays, or supersonic waves, which has conventionally been necessary, is no longer necessary.

Embodiment 3.

Contrary to Embodiments 1 and 2, in which a monocular camera shoots an object, in Embodiment 3 a multi-eye camera system is used to capture a stereo viewfinder image. The captured stereo viewfinder image is used for the creation of an image for a 3-D display. In the following, a method for such image creation is described mainly in view of the differences from Embodiment 1.

Fig. 27 shows the main stages through which an image for a 3-D display is created. The differences from the stages in Fig. 4 of Embodiment 1 are the following.

1. At Stage 1 in Embodiment 3, displacement information is extracted instead of the motion information in Embodiment 1.
While different-time frames are processed in Embodiment 1, same-time frames are mainly processed here in Embodiment 3. Between frames shot at the same time, the movement of an object cannot be defined. Thus, information about the object displacement between such frames is extracted instead.

2. Stage 2 in Fig. 4 is unnecessary in Fig. 27.
Fig. 27 does not include a stage corresponding to Stage 2 in Fig. 4 (Calculation of 3-D Motion Information), because the distance between cameras is already known as shown in Fig. 13 and depth information can be obtained according to the principle of triangulation using the distance.
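
As a rough illustration of triangulation with a known camera distance, the sketch below uses the standard relation for two parallel, identical cameras, in which depth equals focal length times baseline divided by horizontal displacement. This simplified configuration, the function name, and the parameter names are assumptions made here for illustration; the method actually referred to is the one described with Fig. 13.

```python
def depth_from_displacement(displacement, baseline, focal_length):
    """Depth of a point seen by two parallel cameras (assumed configuration).

    displacement : horizontal offset of the corresponding points between the
                   two frames, in the same units as focal_length (e.g. pixels)
    baseline     : known distance between the two cameras
    focal_length : distance from the viewpoint to the screen
    """
    if displacement <= 0:
        raise ValueError("displacement must be positive for a point in front of the cameras")
    return focal_length * baseline / displacement
```

With this relation, halving the displacement doubles the depth, and because the baseline is known the result is obtained in absolute units rather than up to a scale factor.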

When inaccuracy could be caused with respect to the relative positional relationship between a plurality of cameras of a multi-eye camera system, it is desirable to use self-calibration to correct such inaccuracy at Stage 2. Methods for self-calibration are described in references such as "Self-Calibration of Stereo Cameras" by Tomita and Takahashi, Journal of the Information Processing Society of Japan, Vol. 31, No. 5 (1990), pp. 650-659, JP Laid-Open No. Hei 2-138671 and JP Laid-Open No. Hei 2-138672.

Stages 1 to 3 of Embodiment 3 will next be described.

[Stage 1] Extraction of 2-D Displacement Information

In addition to the substitution of displacement information for motion information, Frames t and t' are replaced by Frames 1 and 2, which are respectively shot by Cameras 1 and 2 at Time t. In Embodiment 3, it is possible to create a final image based on a minimum of only two frames, which are shot at the same time, that is, Time t. In other words, when shooting with a multi-eye camera, the viewfinder image captured may be a still image. Stage 1 further differs from that of Embodiment 1 as follows.

(1) In Step 11 of Embodiment 1 (Setting a Corresponding Point Candidate Area), the amount of calculation is reduced by the appropriate selection of different-time frames or the limitation of a corresponding point candidate area, which are conducted based on the intensity or the trails of the movement of a viewfinder image. In Embodiment 3, on the other hand, a method different from that in Embodiment 1 is employed, as described in the following, to limit a corresponding point candidate area for the same purpose.

It is assumed that a multi-eye camera system is positioned horizontally, as is usually the case. The Y coordinates (vertical coordinates) of corresponding points in frames shot by the cameras of the multi-eye camera system are substantially the same as one another. Taking this into consideration, as well as errors due to image processing or camera installation, a corresponding point candidate area can be limited to a horizontally elongated band area. Moreover, it is assumed that Frames 1' and 2' are shot at Time t' and Frames 1 and 2 at Time t, respectively, wherein Time t' precedes Time t. When the positional difference of the representative points between Frames 1' and 2' is x, it can be predicted that the corresponding point candidate areas in Frames 1 and 2 can be set so as to have the same difference of x, or thereabouts, between each other. In other words, the corresponding point candidate areas in Frames 1 and 2 can be limited to regions whose difference is about x.

(2) Although statistical analysis is introduced for a slow movement in Step 12 of Embodiment 1 (Calculation of non-similarity in the area for candidates for a corresponding point), this analysis is unnecessary in Embodiment 3.

(3) Similarly to Step 12 in Embodiment 1, block matching is introduced in determining positions of corresponding points in Embodiment 3. However, in Embodiment 3, biased block matching may be more effective than simple block matching in some cases, such as when the multi-eye camera system to be used is constituted of cameras with different characteristics. For example, if Camera 2 tends to produce more bluish images than Camera 1, the color density of Frame 2 should have its blue component (B) subtracted to a certain extent (that is, by a color deflection constant $\alpha_B$) before undergoing block matching. Without such an adjustment, there is a risk that the meaning of E3 for combining E1 and E2 may become invalidated. An example will be taken where a color density is expressed in red, green and blue spaces. In such a case, not only blue (B), but also red (R) and green (G) should undergo such an adjustment through subtraction of color deflection constants $\alpha_R$ and $\alpha_G$, respectively. Note that the biased block matching evaluates the similarity based on a squared difference of gray-scale level. This means that the similarity can be treated as a distance in the color space, which is the same metric as is used for the relative positional acceptability of viewfinder images. Therefore, the similarity and the acceptability can be combined together and used for the matching evaluation.

Referring to Fig. 7 and based on Equation 1, biased block matching will be described using equations. Pt(i,j) in Embodiment 1 is denoted as P1 and P2, respectively corresponding to Frames 1 and 2, and It(i,j) as I1 and I2. Since Equation 1 can be simplified to be expressed as Equation 18, Equation 18 can be used in normal block matching with respect to a gray-scale image.

$$E_1 = \sum\sum \{I_1(P_{1x}+u, P_{1y}+v) - I_2(P_{2x}+u, P_{2y}+v)\}^2 \qquad \text{[Equation 18]}$$

On the other hand, biased block matching is represented by the following Equation 19, which is a modification of Equation 18.

$$E_1 = \sum\sum \{I_1(P_{1x}+u, P_{1y}+v) - I_2(P_{2x}+u, P_{2y}+v) - \alpha\}^2 \qquad \text{[Equation 19]}$$

For a color image, with $\alpha$ being any one of $\alpha_R$, $\alpha_G$ and $\alpha_B$, E1 is calculated for all viewfinder images in all RGB spaces, so as to obtain the total thereof, that is, $E_{1R} + E_{1G} + E_{1B}$, which is used in block matching. For simplicity, Equation 19 can be expressed as Equation 20, with I1 and I2 representing $I_1(P_{1x}+u, P_{1y}+v)$ and $I_2(P_{2x}+u, P_{2y}+v)$, respectively.

$$E_1 = \sum\sum (I_1 - I_2 - \alpha)^2 \qquad \text{[Equation 20]}$$

wherein I1 and I2 are functions of u and v, and $\alpha$ is a constant.

The optimum value of $\alpha$ is obtained next. Since Cameras 1 and 2 shoot the same object, viewfinder images captured by both cameras should comprise substantially the same content, except for the displacements of the respective image parts. In other words, the more similar the characteristics of the cameras are, the smaller the value of E1 in Equation 20 becomes. Based on this fact, it is known that $\alpha$ should be a value which minimizes the value of E1. Since Equation 20 can be expressed as Equation 21,

$$E_1 = \sum\sum \{(I_1 - I_2)^2 - 2\alpha(I_1 - I_2) + \alpha^2\} \qquad \text{[Equation 21]}$$

$$\phantom{E_1} = \sum\sum (I_1 - I_2)^2 - 2\alpha\sum\sum (I_1 - I_2) + \sum\sum \alpha^2$$

provided that the total number of pixels in a block is N, Equation 21 is further expressed as Equation 22, for $\sum\sum 1 = N$.

$$E_1 = \sum\sum (I_1 - I_2)^2 - 2\alpha\sum\sum (I_1 - I_2) + N\alpha^2 \qquad \text{[Equation 22]}$$

Therefore, since

$$dE_1/d\alpha = -2\sum\sum (I_1 - I_2) + 2N\alpha$$

holds, the value of E1 is minimized when Equation 23 holds.

$$\alpha = \{\sum\sum (I_1 - I_2)\}/N \qquad \text{[Equation 23]}$$

Since $\alpha$ can be understood as the average difference in color density of respective pixels between the two subject areas of block matching, a substitution of Equation 23 into Equation 22 leads to Equation 24.

$$E_1 = \sum\sum (I_1 - I_2)^2 - \{\sum\sum (I_1 - I_2)\}^2/N \qquad \text{[Equation 24]}$$

Therefore, it is concluded that Equation 24 is used for biased block matching. With the introduction of Equation 24, if it is assumed that Cameras 1 and 2 shoot exactly the same object, the value of E1 becomes zero. On the other hand, the value of E2 also becomes substantially zero. Therefore, it is understood that biased block matching is effective in eliminating an initial error caused by the judgement as to the similarity of viewfinder images. Afterwards, the best matching will be searched for through the same process as that in Embodiment 1.

It is to be noted that a color density other than an RGB density, such as an HVC density, may also be applied without a problem for block matching. Moreover, block matching may be carried out based on a color difference, that is a residual difference, instead of a squared difference of gray-scale level. When a correction value $\alpha$, which has been determined by Equation 23, exceeds a predetermined value range, biased block matching may be discontinued. It is necessary to provide such a maximum limitation value; without it, the block matching may sometimes detect an incorrect corresponding point because the block including a point at issue accidentally has a similar pattern although it has a quite different color. However, since the color difference caused by camera characteristics is generally not very large and therefore is within a predetermined limitation range, the introduction of such a limitation value would be useful and practical.

With the biased block matching discontinued, normal block matching may be used to evaluate the similarity of viewfinder images. Alternatively, the value derived from the biased block matching may be used, after correcting image parts only at the upper limit value of a correctable range (hereinafter referred to as T), which can be computed with the following equation.

$$E_1 = \sum\sum (I_1 - I_2)^2 - \{\sum\sum (I_1 - I_2)\}^2/N + Nx^2$$

wherein $x = |\sum\sum (I_1 - I_2)/N| - T$.
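
The evaluation value for biased block matching, including the limited correction, can be sketched as follows. The function name, the nested-list block representation, and the choice to return the applied correction together with the error are assumptions made for illustration only.

```python
def biased_block_error(block1, block2, limit=None):
    """Biased block matching error between two equally sized pixel blocks.

    block1, block2 : nested lists of pixel values (I1 and I2)
    limit          : optional upper limit T of the correctable range
    Returns (alpha_used, error). (Hypothetical helper.)
    """
    diffs = [a - b
             for row1, row2 in zip(block1, block2)
             for a, b in zip(row1, row2)]
    n = len(diffs)                       # N: number of pixels in the block
    s = sum(diffs)                       # sum of (I1 - I2)
    q = sum(d * d for d in diffs)        # sum of (I1 - I2)^2

    alpha = s / n                        # Equation 23: optimum correction
    error = q - s * s / n                # Equation 24: error at the optimum

    if limit is not None and abs(alpha) > limit:
        # Correct only up to T; the remainder x = |alpha| - T adds N * x^2.
        excess = abs(alpha) - limit
        error += n * excess * excess
        alpha = limit if alpha > 0 else -limit
    return alpha, error
```

For two blocks that differ only by a constant offset, the unrestricted error is zero; once the offset exceeds T, the error grows with the square of the uncorrected remainder, so blocks with a quite different color are no longer treated as a perfect match.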

(4) In Step 13 of Embodiment 1 (Deternrtination of an Initial PosWon of a Corresponding Point), a point wrth a stable 
so movement among different-time Frames t, V is further selected as a characteristic point. In Embodiment 3, addi- 
tional criteria are conskiered for the selection In Fig. 28, Frames 10 to 12 constitute different-time frames to one 
another shot by Camera 1 , while Frames 20 to 22 constitute different-time frames shot by Camera 2. Two frames 
shown side by side in Fig. 28 constitute same-time frames as each other. While directing attention at Point P in 
respective frames, its movement between the different-time frames is expressed with a vector An (n beii^ a natural 
25 number), and its movement between the same-time frames with a vector Bn. 

With the vectors set as described above, a point which meets the following criterion will be selected as a characteristic point.

(a) Vector Bn is substantially consistent or moves substantially consistently.

In addition to the above criterion (a), the following criterion (b) may be added, so as to select a point which meets both criteria as a characteristic point.

(b) Vector An is substantially consistent or moves substantially consistently.

Criterion (b) corresponds to the condition introduced in Embodiment 1. As described above, when shooting with a multi-eye camera system, it is possible to obtain depth information from same-time frames only. For this, it is necessary to obtain a correct corresponding relationship between viewfinder images. In obtaining the correct corresponding relationship, it is advisable to additionally use information obtainable from different-time frames. Since it is considered as having been accurately traced, a point which simultaneously meets the above two criteria will provide key information in the extraction of 2-D displacement information. When a camera captures a still viewfinder image, the known dynamic programming may be applied to obtain corresponding points.
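
Criteria (a) and (b) can be sketched as a simple test on the sequences of vectors Bn and An. The sketch below is one possible reading under stated assumptions: "substantially consistent" is taken to mean that successive vectors differ by no more than a tolerance, and "moves substantially consistently" that successive changes differ by no more than that tolerance. The function names and the tolerance tol are hypothetical.

```python
def is_consistent(vectors, tol=1.0):
    """True if 2-D vectors stay roughly constant or change at a roughly constant rate."""
    if len(vectors) < 2:
        return True
    # Changes between successive vectors.
    steps = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(vectors, vectors[1:])]
    # Roughly constant vector: every change is small.
    if all(abs(dx) <= tol and abs(dy) <= tol for dx, dy in steps):
        return True
    # Roughly constant change: successive changes are nearly equal.
    return len(steps) >= 2 and all(
        abs(s2[0] - s1[0]) <= tol and abs(s2[1] - s1[1]) <= tol
        for s1, s2 in zip(steps, steps[1:])
    )


def is_characteristic(b_vectors, a_vectors=None, tol=1.0):
    """Criterion (a) on Bn, optionally combined with criterion (b) on An."""
    if not is_consistent(b_vectors, tol):
        return False
    return a_vectors is None or is_consistent(a_vectors, tol)
```

Passing only the vectors Bn applies criterion (a) alone; passing both sequences requires a point to meet both criteria.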

[Stage 2] Acquisition of Depth Information

Depth information is calculated based on the displacement of the respective image parts, which has been obtained at Stage 1. In multi-eye shooting, where the situation of Fig. 13 is achieved at time t, depth information can be obtained by the method disclosed at Stage 3 of Embodiment 1.

It is to be noted that, since the respective cameras of the multi-eye camera system are situated in a fixed relationship with one another, assuming that the relationship among them and their magnification rates (focal distances) are known, depth information in a real (absolute) value can be obtained, including a scale factor k, which cannot be determined in Embodiment 1.
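
To illustrate why a known camera relationship yields an absolute depth, the following sketch uses the standard relation for two parallel cameras, Z = f B / d. This is not the method of Stage 3 of Embodiment 1, which is not reproduced here; the parallel-axis arrangement, the function name, and the parameter names are assumptions made for the example.

```python
def absolute_depth(disparity_px, baseline_m, focal_px):
    """Depth in meters of a point seen by two parallel cameras.

    disparity_px: horizontal displacement of the point between same-time frames, in pixels.
    baseline_m: known distance between the two cameras, in meters.
    focal_px: known focal distance expressed in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px
```

Because the baseline and the focal distance are known, the result carries a real unit, whereas a single moving camera with unknown movement gives depth only up to a scale factor.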

[Stage 3] Creation of an Image

An image is created through the same process as that at Stage 4 in Embodiment 1 (Creation of an Image).

In Embodiment 3, as described above, a camera receives a stereo viewfinder image and outputs an image for a 3-D display. Therefore, the viewfinder image captured by the camera will be precisely reproduced for output, in addition to the fact that a desired image can be created through image processing, including an enhanced display, as described in Embodiment 1.

Embodiment 4.

A technique for creating a viewfinder image seen from a different point, by utilizing a mouse and its clicking action, is described in Embodiment 2. In the following, examples will be described where various viewfinder images seen from different points are created for a variety of purposes.

As described above, according to the present invention, it is possible to create a viewfinder image seen from a different point without moving a camera. In this case, naturally, a viewfinder image seen from a viewpoint which is hypothetically located a shorter distance from the actual viewpoint will result in greater accuracy. By utilizing this fact, the following applications can be achieved.

1. The creation of a viewfinder image with multi-viewpoints, based on a viewfinder image shot by a double-eye camera

When a stereo viewfinder image is available with a double-eye camera system, a viewfinder image with multi-viewpoints will be created by hypothetically providing a third camera. In other words, a point at which the hypothetical third camera is placed is determined so that the third camera is set apart from the other cameras by a small space. Then, a viewfinder image seen from the thus determined point is created. The thus created image, which is relatively accurate, and the two viewfinder images actually captured by the two cameras of the double-eye camera system are combined together, so as to create a good viewfinder image with multi-viewpoints. Subsequently, additional considerations of depth information would permit the creation of an image for a 3-D display, which is seen from any of the multi-viewpoints.

2. Creation of a Viewfinder Image in Slow Motion

The two different-time frames closest in time are designated respectively as Frames t and t'; the viewpoints of Frames t and t' are designated respectively as Viewpoints t and t'. Although the viewpoint actually changes from Viewpoint t to t' between Frames t and t', no viewfinder image between them is available. Therefore, by providing a hypothetical viewpoint between Viewpoints t and t', a viewfinder image seen from a different point, that is, a point between Viewpoints t and t' in this example, is newly created. A plurality of viewfinder images seen from different points are created in this way. Then, a sequential display of such viewfinder images would present a viewfinder image in slow motion, which has the following effects.

a. The movement among the respective viewfinder images becomes smooth, instead of the original flickery movement.
b. With a smaller movement of a viewpoint between frames closer in time, the quality of the viewfinder image between those frames is not degraded.
c. Variation in the path along which the viewpoint moves from Viewpoint t to t' would provide a different effect on the viewfinder image in slow motion.

Additional considerations of depth information would permit the creation of an image for a 3-D display. It is to be noted that the above-mentioned technology can be applied to same-time frames without problems.
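
The placement of hypothetical viewpoints between Viewpoints t and t' can be sketched as follows. The straight-line path and even spacing are assumptions chosen for simplicity; as noted in effect c, other paths are possible. The function name and the representation of a viewpoint as a coordinate tuple are hypothetical.

```python
def intermediate_viewpoints(view_t, view_t_prime, count):
    """Return `count` viewpoints evenly spaced on the straight line between two viewpoints.

    view_t, view_t_prime: viewpoint positions as tuples of coordinates.
    """
    points = []
    for i in range(1, count + 1):
        s = i / (count + 1)  # fraction of the way from Viewpoint t to Viewpoint t'
        points.append(tuple(a + s * (b - a) for a, b in zip(view_t, view_t_prime)))
    return points
```

A viewfinder image would then be created for each returned viewpoint and displayed in sequence between Frames t and t'.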


Embodiment 5.

Embodiment 5, which is substantially the same as Embodiment 1 except that it outputs an image for a 2-D display, aims to practice the following image processing using depth information.

1. Change in a Viewpoint

Accompanying a hypothetical change of a viewpoint, a viewfinder image should also be varied. According to the present invention, when the viewpoint is hypothetically changed, a viewfinder image seen from the changed viewpoint is automatically created while the camera is kept fixed.

2. Partial Expansion or Compression of an Image

By utilizing depth information, the most natural and effective viewfinder image is automatically created through partial scaling as required.

3. Separation of an Image Area

For separating a desired image area, it is first necessary to fully recognize the respective image areas. For area recognition, several methods have been proposed, including a clustering method, but they have produced only unsatisfactory results. The present invention permits an accurate area separation in a totally different way from the conventional way, using depth information.

Since depth information is obtained through the same process as that in Embodiment 1, in the following, only Stage 4 (Creation of an Image) will be described, as it differs from Embodiment 1.

[Stage 4] Creation of an Image

A desired viewfinder image is created according to the depth information, which has been obtained at Stage 3. At stages up to Stage 3, at least two viewfinder image frames have been required in extracting the necessary information, though it is possible to create a desired image based on only a single viewfinder image frame at Stage 4.

(1) A Viewfinder Image Seen from a Different Point

Figs. 29 and 30 show a corresponding relationship between an original viewfinder image and one re-created so as to be seen from a changed viewpoint. Fig. 29 is the original viewfinder image, showing a tree, a house, and a person, each having a smaller depth in this order. Fig. 30 is the viewfinder image created with the assumption that its viewpoint is hypothetically moved to a point somewhere at the top right of the scene.

As is apparent from those drawings, according to the present invention, it is possible to obtain a viewfinder image seen from a different point while the camera is fixed, because 3-D information about the respective image parts, including the depth information, has been known from Stage 3. In this example, it was assumed that the viewpoint was moved up to the top right of the scene, although it can equally be understood that the object was moved down to the bottom left of the scene. The movement to the bottom left can be expressed in the form of translation and rotation movements, as described in Stage 3. By reversely following the processes at Stages 1 to 3, it is possible to compute a 2-D movement of the object on the screen, based on this hypothetical 3-D movement of the object, so as to create the viewfinder image shown in Fig. 30. Since no room is left for arbitrariness in the creation through Stages 1 to 4, the thus created viewfinder image is very natural.
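
The computation of a 2-D movement on the screen from a hypothetical 3-D movement can be sketched as below. The sketch assumes a simple perspective projection with the viewpoint at the origin and restricts the rotation to the vertical axis; the exact expressions of Stage 3 are not reproduced here, and all function and parameter names are hypothetical.

```python
import math


def project(point, focal=1.0):
    """Perspective projection of a 3-D point (x, y, z) onto the screen."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the viewpoint")
    return (focal * x / z, focal * y / z)


def move(point, yaw_rad, translation):
    """Rotate a point about the vertical (y) axis, then translate it."""
    x, y, z = point
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    return (c * x + s * z + tx, y + ty, -s * x + c * z + tz)


def screen_displacement(point, yaw_rad, translation, focal=1.0):
    """2-D movement on the screen caused by the hypothetical 3-D movement."""
    u0, v0 = project(point, focal)
    u1, v1 = project(move(point, yaw_rad, translation), focal)
    return (u1 - u0, v1 - v0)
```

For the same movement to the bottom left, a part with a small depth is displaced further on the screen than a part with a large depth, which is what makes the re-created image look as if seen from a different point.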

In this stage, it is preferable to consider and reflect a masking relationship in creating an image. Concretely speaking, in Fig. 30 for example, accompanying the change of the viewpoint, the bottom part of the tree becomes obscured by the roof of the house. Therefore, for creating a natural viewfinder image, the bottom part of the tree should be covered by the image data of the house. In actual software processing, the creation should be started with an image part having a larger depth in order to create a natural viewfinder image. Alternatively, the Z-buffer technique, which is widely used in computer graphics, can be used for this purpose. For obtaining a masking relationship through computation, a judgement is first made as to whether the sight vectors directed at the respective image parts from the changed viewpoint are overlaid on one another. When the sight vectors to Parts A and B are overlaid on each other, and Part A is located closer to the viewpoint than Part B, it is known that Part A should be seen as masking Part B. An image may be created based on the thus computed information.
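
The rule of starting the creation with the image part having the larger depth can be sketched as follows. The representation of an image part as a tuple of depth, label, and pixel positions is an assumption made for the example, as is the function name compose.

```python
def compose(parts, width, height, background=None):
    """Draw image parts from the largest depth to the smallest.

    parts: list of (depth, label, pixels), where pixels is an iterable of (x, y).
    Parts drawn later overwrite earlier ones, so nearer parts mask farther ones.
    """
    canvas = [[background] * width for _ in range(height)]
    for depth, label, pixels in sorted(parts, key=lambda p: p[0], reverse=True):
        for x, y in pixels:
            if 0 <= x < width and 0 <= y < height:
                canvas[y][x] = label
    return canvas
```

Where the house and the tree fall on the same screen position, the house, having the smaller depth, is drawn last and therefore covers the tree.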

(2) Partial Scaling of an Image

In an enhanced display, one of the image display techniques, a closer object may be re-positioned even closer, while a distant object may be made even more distant, so that the contrast in depth between the objects is emphasized.

For this image processing, according to the present invention, the scale of an image is partially changed based on depth information. Fig. 31 shows the same viewfinder image as Fig. 29, except that a part of it is magnified. As a result of expanding the person, who has the shortest depth among all objects in the drawing, the person is perceived as being much closer to the viewer. As a result, an effective enhanced display is achieved. In this case, preferably, a masking relationship is also reflected in the newly created viewfinder image.

It is to be noted that, in expanding an area with the shortest depth, there is no limitation on the magnifying ratio, because that area can be expanded without a problem until it is perceived as having no depth at all. However, in magnifying an area having around a middle depth in the viewfinder image, that is, the house in Fig. 31, since it should not be perceived as being closer than the person, the magnifying ratio is accordingly restricted. Violation of such a restriction would result in an unnatural viewfinder image. In the expansion according to depth information, as is executed in the present invention, it is possible to set conditions such as that only areas with the shortest depth should be magnified, and only ones with the largest depth should be reduced, so as to create a natural and realistic image, that is, an image in compliance with the laws of nature.

In the above, methods have been described for creating a natural image, though unnatural images may sometimes be demanded, such as when the sense of unnaturalness needs to be emphasized by displaying a distant part larger than a closer part. Such an unnatural image may be used for games or the like. In any case, according to the present invention, naturalness or unnaturalness can be freely created as desired. Conventionally, a natural image may or may not have been created, as a matter of accident, when the scale of some parts of the image was changed. However, according to the present invention, the creation of a natural or an unnatural image is ensured as requested.

Once a natural image is created, in order to further carry out the above-described process (1) or a process (3) to be described later on the natural image created, it is preferable to begin the process by changing the depth of the expanded or compressed areas. For example, when an area is doubled in size, its depth should be halved. Conversely, when an area is halved in size, its depth should be doubled. This correction is necessary because the size of an area is inversely proportional to its depth. An image so corrected would ensure that a natural image is produced in the subsequent processes.
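
The depth correction follows directly from the inverse proportionality between size and depth, and can be sketched as below. The second function is an inference from the same relation, offered only as an illustration of how a restriction on the magnifying ratio might be derived; it assumes the depth of the nearer area is left unchanged. Both function names are hypothetical.

```python
def corrected_depth(depth, scale):
    """Depth to assign to an area after its size has been multiplied by `scale`."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return depth / scale


def max_natural_scale(depth, nearer_depth):
    """Largest magnification that keeps an area from appearing closer than a nearer area.

    Assumption: the nearer area keeps its original depth.
    """
    if depth <= 0 or nearer_depth <= 0:
        raise ValueError("depths must be positive")
    return depth / nearer_depth
```

For instance, a house at a depth of 10 m that is doubled in size would be treated as being at 5 m in the subsequent processes.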

In conducting the image processing of the above (1) and (2), an image may be finalized by smoothing uneven parts along the edge of the image. For example, when re-creating the viewfinder image in Fig. 29 into that in Fig. 30, it will never happen that all image parts in Fig. 29 correspond to those in Fig. 30 in a one-to-one relationship. Concretely speaking, the space shown at the top right corner of the image in Fig. 30 may show objects not seen at the same region of the image in Fig. 29. Therefore, in a naive creation of the viewfinder image of Fig. 30 based on that of Fig. 29, an image part to be shown at the region is actually broken off in Fig. 30. This break-off causes a recess with respect to the ideal edge line of the image. For the same reason, not all image parts included in Fig. 29 are shown within the ideal edge of the image in Fig. 30, and some image parts project from the edge.

In order to solve this problem and to maintain the original screen shape (rectangular in this example), such recesses are filled with extra pixels, while such projections are cut off as redundant pixels. The filling should be made with pixels having the same color as that of the images in the adjacent region. When the above image processing (2) causes similar unevenness along the edge line of the image, a similar amendment would solve the problem. With this amendment, the edge line of the image is displayed naturally.
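
The filling of recesses with the color of the adjacent region can be sketched for a single row of pixels as follows. Marking missing pixels with None and taking the nearest defined pixel in the same row as the adjacent region are assumptions made for the example; the function name is hypothetical.

```python
def fill_recesses(row):
    """Fill missing pixels (None) with the color of the nearest defined pixel in the row."""
    filled = list(row)
    defined = [i for i, pixel in enumerate(row) if pixel is not None]
    if not defined:
        return filled  # nothing to copy from
    for i, pixel in enumerate(row):
        if pixel is None:
            nearest = min(defined, key=lambda j: abs(j - i))
            filled[i] = row[nearest]
    return filled
```

Projections beyond the rectangular screen need no such treatment, since pixels outside the screen are simply not drawn.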

(3) Separation of an Image

A desired image part is separated to be individually processed. Referring to Fig. 29, it is assumed that the person, the house, and the tree respectively have a depth of 3 m, 10 m, and 20 m. In order to separate the person only, a condition such as "within five meters in depth" is set before starting the detection and judgement of the depths of the respective parts. In order to separate the house, the condition may instead be such as "within five to fifteen meters in depth."

Fig. 32 is a viewfinder image created with the house separated from Fig. 29. After separating a desired image area, the rest may be left blank, or the separated area may be pasted on a different viewfinder image.
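
The separation by a depth condition can be sketched as follows, assuming the image and its depth information are held as two grids of the same shape. The function name separate and the use of None for blank pixels are hypothetical choices made for the example.

```python
def separate(image, depth_map, min_depth, max_depth, blank=None):
    """Keep only pixels whose depth lies within [min_depth, max_depth]; blank the rest."""
    return [
        [pixel if min_depth <= depth <= max_depth else blank
         for pixel, depth in zip(image_row, depth_row)]
        for image_row, depth_row in zip(image, depth_map)
    ]
```

With the depths of 3 m, 10 m, and 20 m assumed above, a range of 0 to 5 m keeps only the person, and a range of 5 to 15 m keeps only the house.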

As described above, the present invention provides methods for image recognition and processing as well. Conventionally, image areas have been separated manually or by means of a clustering method using colors. The present invention provides a method for achieving an accurate area recognition in a totally different way from the conventional method, using depth information.

As described thus far, the present invention discloses a method for image processing using accurate depth information. Since the series of processes can be fully automated with software, the present invention is applicable over a wider range.

Embodiment 6.

An appropriate apparatus for practicing Embodiment 5 will be described, which is substantially the same as that described in Embodiment 2, except that it outputs a single type of image, instead of two types of images, that is, images for right and left eyes, as in Embodiment 2. The operation will next be described only in connection with structural differences from Embodiment 2. A camera shoots an object so as to capture its viewfinder image, which is supplied via the image input circuit 20 to be stored in the frame memory 24. A plurality of frames of the viewfinder image are read out from the frame memory 24 to be processed through the corresponding point detection circuit 26 and the motion detection circuit 30, for obtaining depth information of the object.

Subsequently, the image creation circuit 32 creates an image, such as a viewfinder image seen from a different point, according to the depth information. In this case, with an instruction supplied via the instruction input section 34, various processes will be carried out, including the creation of a viewfinder image seen from a different point, expansion, compression, or separation, as described in Embodiment 5.









EP 0 735 512 A2 



Embodiment 7. 

A method for creating an image for a 2-D display when receiving a stereo viewfinder image will be described.

The difference between Embodiments 5 and 7 is the same as that between Embodiments 1 and 3. According to Embodiment 7, it is generally possible to obtain depth information with a high accuracy, and thereby achieve a highly accurate creation of a viewfinder image seen from a different point as a final image.

Embodiment 8. 

Similarly to Embodiment 1, a method for displaying a good stereoscopic image, based on information about a depth and about a 2-D image, will be described. Embodiment 8 is different from Embodiment 1 in that it considers conditions unique to a display apparatus when displaying images.

As disclosed in the foregoing JP Publication No. Sho 55-36240, when given depth information, it is possible to create and display a stereo image based on a 2-D image. That is, the sender transmits television image signals (2-D image signals) appended with depth information. The receiver, on the other hand, divides the received image signals into two groups. Then, the respective image parts of one of the two image signal groups are given some displacement according to the depth information, so as to create right and left eye images, respectively. The thus created images are displayed on a stereo image display apparatus, to achieve reproduction of a stereoscopic image.

In this case, it is necessary to consider the nature of a parallax. In other words, as has already been discussed above, a parallax is based on an angular difference between sight vectors, and the extent of this difference varies among display apparatuses of different sizes even when the same number of pixels is displaced, so that the parallax resultantly varies depending on the size of the display apparatus. Even assuming that the size is the same, the parallax still varies depending on the distance between the display apparatus and the viewer. Therefore, in order to achieve the optimum 3-D effect, the extent of displacement should be determined individually according to the unique conditions of a display apparatus.

In Embodiment 8, a correction value unique to each stereo image display apparatus is introduced in addition to depth information. Fig. 33 shows a structure of a stereo image display apparatus according to Embodiment 8, in which a 2-D image and depth information are supplied via an input terminal 100, and the latter is extracted in a depth information extraction circuit 102 by a known method.

The 2-D image, on the other hand, is divided into two groups. One is supplied to a buffer memory 104, and the other is supplied to a right eye image displacement circuit 106. The buffer memory 104 absorbs a delay caused in the displacement circuit 106. A left eye display panel 108 displays an image transmitted from the buffer memory 104, while a right eye display panel 110 displays an image given a displacement in the displacement circuit 106.

This apparatus is characterized in that the displacement circuit 106 determines the extent of displacement with reference not only to depth information but also to parameters unique to the apparatus, the parameters being pre-stored in a ROM 112. The ROM 112 stores the optimum correction value for the apparatus, which complies with the following general rules.

(1) Relating to the size of a display panel: for a smaller display panel, a larger value is stored.
(2) Relating to the distance from a display panel to a viewer in ordinary use: for a smaller distance, a smaller value is stored.

The displacement circuit 106 gives a larger displacement if a depth is smaller or a correction value is larger, the correction value being predetermined according to the above-mentioned rules. As a result, the optimum stereoscopic display is achieved, which reflects the conditions unique to the display apparatus.
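
The behavior of the displacement circuit 106 can be illustrated with the sketch below. The text specifies only that the displacement grows as the depth becomes smaller or the correction value becomes larger; the particular form, displacement proportional to the correction value divided by the depth, is an assumption, as are the function name and the gain parameter.

```python
def displacement(depth, correction, gain=1.0):
    """Extent of displacement for an image part.

    Larger for a smaller depth and for a larger correction value.
    The inverse-proportional form is an assumed example, not taken from the text.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    return gain * correction / depth
```

A smaller display panel would thus be given a larger stored correction value, and so a larger displacement in pixels for the same depth.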

Embodiment 8 may also be applied with the following technical variations.

1. As shown in Fig. 33, a volume 114 may be provided for manually changing the extent of displacement, so that a supplementary adjustment or an adjustment according to a personal preference can be performed.

2. Displacement may be given to both right and left images.

3. As is described in the foregoing Nikkei Electronics No. 444, a 3-D effect may be created using a Pulfrich effect.

4. The ROM 112 may have pre-stored therein a plurality of correction values, so as to be selected for use depending on the situation.

5. Correction values may be stored in two groups, one being changed depending on the screen size, the other being changed depending on the distance between the display and the viewer.

6. In the description above, the value of depth information is accurately proportional to the distance between the shooting position and the object. However, depth information may show the absolute distance between the shooting position and the object.

A simple case will be taken as an example, where depth information comprises only three distinctions, that is, large, medium, and small. When depth information indicates "large," or a long distance, no displacement is made, so as to cause no parallax to the image part. When "medium," or a medium distance, is indicated, some displacement is made to cause a parallax to the image part. When "small," or a small distance, is indicated, a large displacement is made to cause a large parallax to the image part.

The thus simplified depth information could reduce the volume of transmission data for broadcasting, in addition to achieving a circuit having a simple structure for a stereo image display apparatus.
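
The three-level case can be sketched as a simple lookup. The pixel values in the table and the optional correction factor are assumptions chosen for illustration; the text fixes only that "large" gives no displacement, "medium" some displacement, and "small" a large displacement.

```python
# Assumed example values in pixels; only their ordering follows the text.
DISPLACEMENT_PX = {"large": 0, "medium": 4, "small": 12}


def displace(level, correction=1.0):
    """Displacement in pixels for simplified depth information.

    level: "large", "medium", or "small".
    correction: hypothetical correction factor unique to the display apparatus.
    """
    return round(DISPLACEMENT_PX[level] * correction)
```

Since only one of three levels needs to be transmitted per image part, two bits per part would suffice in such a scheme.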

Claims 

1. A method for creating an image for a 3-D display, comprising:
a step of extracting depth information from a 2-D motion image; and
a step of creating an image for a 3-D display according to the depth information.

2. A method for creating an image for a 3-D display according to claim 1, wherein the step of extracting depth information comprises:
a step of detecting a movement of the 2-D motion image;
a step of calculating a relative 3-D movement between a scene and a shooting viewpoint of the 2-D motion image; and
a step of calculating relative distances from the shooting viewpoint to respective parts in 3-D space, which are projected in the 2-D motion image, based on the relative 3-D movement and the 2-D movements of the respective image parts.

3. A method for calculating depth information, based on frames included in a 2-D motion image, comprising:
a step of selecting two frames with an appropriately large movement between them from the 2-D motion image; and
a step of calculating the depth information, based on the two frames.

4. A method for calculating depth information according to claim 3, further comprising:
a step of providing a plurality of representative points in a reference frame;
a step of determining a plurality of corresponding points in another frame, so as to correspond to each of the representative points; and
a step of obtaining a positional relationship between the representative and corresponding points,
wherein
a position of a corresponding point in another frame is predicted according to the positional relationship between the representative and corresponding points, so as to limit an area for searching for the corresponding point in another frame.

5. A method for calculating depth information according to claim 3, further comprising:
a step of providing a plurality of representative points in a reference frame;
a step of determining a plurality of corresponding points in another frame, so as to correspond to each of the representative points; and
a step of obtaining a positional relationship between the representative and corresponding points;
each of the representative points being classified either as a characteristic or a non-characteristic point;
wherein
when more than a predetermined number of characteristic points move between the reference and other frames, such that a total of their movements exceeds a predetermined value, it is judged that a movement between the reference and the other frames is appropriately large, and the reference and the other frames are therefore selected.

6. A method for calculating depth information according to claim 3, further comprising:
a step of providing a plurality of representative points in a reference frame;
a step of determining a plurality of corresponding points in another frame, so as to correspond to each of the representative points; and
a step of obtaining a positional relationship between the representative and corresponding points;
the representative points being classified either as a characteristic or a non-characteristic point,
wherein
when more than a predetermined number of the characteristic points move between the reference and other frames such that a variance of their movements exceeds a predetermined value, it is judged that a movement between the reference and other frames is appropriately large, and the reference and other frames are therefore selected.



7. A method for calculating depth information according to any one of claims 3 to 6, wherein
the calculation of the depth information is discontinued when the two frames with an appropriately large movement between them cannot be selected from the 2-D motion image.

8. A method for calculating depth information according to any one of claims 5 to 7, wherein
a corresponding point of a representative point relating to an image area having a geometric characteristic is adjusted to be positioned such that an image area related thereto retains the geometric characteristic.

9. A method for calculating depth information according to claim 8, wherein
the image area having a geometric characteristic is an area including a straight line.

10. A method for calculating depth information, comprising:
a step of providing a plurality of representative points in a reference frame;
a step of conducting evaluation of similarity of images between an image area including specific points which are arbitrarily set in another frame, and a nearby image area including the representative points in the reference frame;
a step of evaluating relative positional acceptability among the specific points;
a step of determining the specific points as corresponding points of the representative points when both evaluations provide favorable results;
a step of conducting a search for a best point where both evaluations provide a most favorable result, while moving one of the corresponding points with all the other corresponding points fixed at their current positions;
a step of conducting a positional change of the one of the corresponding points to the best point, which was found during the search;
a step of sequentially conducting the search and positional change with respect to all of the corresponding points; and
a step of calculating the depth information according to a positional relationship between the representative points and the corresponding points, the corresponding points having been determined through a series of the above steps.

11. A method for calculating depth information according to claim 10, wherein
after the search and positional change is conducted for all of the corresponding points, positional accuracy thereof is improved by solving an Euler-Lagrange differential equation which indicates a condition where a combined value of both of the evaluations is an extremum.

12. A method for calculating depth information according to any one of claims 10 and 11, wherein
the evaluation of similarity of images is conducted by biased block matching, where the similarity is correctly evaluated to be highest when the blocks including the identical object are tested, regardless of shooting conditions, and
relative positional acceptability of images is evaluated using the function of distance between neighboring image parts, and
the results of the above evaluations are treated in terms of distance in the color space and the pixel space, respectively, so that they can be combined together and can be used for the evaluation in determining corresponding points.

13. A method for calculating depth information according to claim 12, wherein
the evaluation of similarity of images is conducted by biased block matching within the limited correction range, which has been pre-determined.

14. A method for calculating depth information, comprising:
a step of providing a plurality of representative points in a reference frame;
a step of determining a plurality of corresponding points in another frame, so as to correspond to each of the representative points; and
a step of obtaining a positional relationship between at least a characteristic point among the representative points and its corresponding point,
wherein
a point whose position moves steadily among a plurality of frames shot at different times is selected as the characteristic point.

15. A method for calculating depth information, comprising:
a step of providing a plurality of representative points in a reference frame;
a step of determining a plurality of corresponding points in another frame, so as each to correspond to each of the representative points; and
a step of obtaining a positional relationship between at least a characteristic point among the representative points and its corresponding point,
wherein
a point is selected as the characteristic point, whose displacement is substantially consistent between frames simultaneously shot, and further substantially consistent or changes substantially consistently in between other frames simultaneously shot at a close but different time.

16. A method for calculating depth information, comprising:
a step of providing a plurality of representative points in a reference image;
a step of determining a plurality of corresponding points in another image, so as to correspond to each of the representative points, respectively;
a step of obtaining a positional relationship between the representative and corresponding points; and
a step of calculating depth information according to the positional relationship,
wherein
the calculation of the depth information is discontinued when less than a predetermined number of characteristic points are selected from the representative points.

17. A method for calculating depth information according to claim 16, wherein
the representative and corresponding points are respectively provided in two frames which are included in a 2-D motion image.

18. A method for calculating depth information, based on two frames included in a 2-D motion image, wherein
the calculation of the depth information is discontinued when movement is small between the two frames.

19. A method for calculating depth information of a 2-D image,
wherein
when a depth of any point in a certain image is calculated as negative, the depth is interpolated by using depth information of points close by having positive depth values.

40 20. A method for image processing using depth information, comprising, 

a step of creating a stereo image by giving a parallax to a 2-D image according to its depth information,
wherein 

the parallax is transformed so as to fall within a predetermined range, so that the stereo image is created according to the transformed parallax.
21. A method for image processing according to claim 20,

wherein 

the parallax is linearly compressed so as to fall within the predetermined range having a point arbitrarily determined as a middle value of the predetermined range.

22. A method for image processing according to claim 20, 

wherein 

the parallax off the predetermined range is uniformly transformed to be a closer value of either an upper or a lower limitation value of the predetermined range.

23. A method for image processing according to claim 20,

wherein 

the parallax is non-linearly transformed such that transformed values smoothly converge to upper and lower limitation values of the predetermined range, to thereby fall within the predetermined range.
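
The three parallax transformations of claims 21 to 23 can be compared in a short sketch. The function names are hypothetical, the predetermined range is described by an assumed middle value `mid` and half-width `half_width`, and the use of a hyperbolic tangent for the non-linear case is an assumption; the claims require only that values converge smoothly to the limits.

```python
import math


def linear_compress(values, mid, half_width):
    """Claim 21 sketch: scale parallax linearly about its own centre so the
    result fits in [mid - half_width, mid + half_width]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [float(mid) for _ in values]
    src_center = (lo + hi) / 2.0
    # Compress only; never expand a range that already fits.
    scale = min(1.0, (2.0 * half_width) / (hi - lo))
    return [mid + (v - src_center) * scale for v in values]


def clamp_to_limits(values, lower, upper):
    """Claim 22 sketch: values outside the range become the nearer limit."""
    return [min(max(v, lower), upper) for v in values]


def smooth_saturate(values, mid, half_width):
    """Claim 23 sketch: tanh maps values so that they converge smoothly to
    the limits (tanh is an assumption, not taken from the claims)."""
    return [mid + half_width * math.tanh((v - mid) / half_width)
            for v in values]
```

Linear compression preserves the relative ordering and spacing of depths, clamping leaves in-range values untouched but flattens everything beyond the limits, and the smooth variant keeps some depth ordering near the limits at the cost of distorting in-range values slightly.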
24. A method for image processing using depth information, comprising:

a step of creating a stereo image by giving a parallax to a 2-D image according to its depth information,
wherein 

the parallax, which is originally determined according to the depth information, is variable.

25. A method for image processing using depth information, comprising: 

a step of creating a stereo image by giving a parallax to a 2-D image according to its depth information; and

a step of displaying the stereo image on a stereo image display apparatus, 

wherein 

a process to be conducted on the 2-D image so as to cause the parallax is determined according to a display condition unique to the stereo image display apparatus.

26. A method for image processing according to claim 25, 

wherein 

the display condition is determined according to a size of a display screen of the stereo image display apparatus and an assumed distance from the display screen to a viewer, and

the process to be conducted on the 2-D image so as to cause a desired parallax is determined individually based on the display condition thus determined.
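
One way the display condition of claim 26 could enter the calculation is sketched below. The formula is the ordinary similar-triangles geometry of stereoscopic viewing, not a formula quoted from this document, and the function name, the parameter names, and the 65 mm eye separation are assumptions made for illustration.

```python
def parallax_in_pixels(perceived_distance_m, viewing_distance_m,
                       screen_width_m, screen_width_px,
                       eye_separation_m=0.065):
    """Pixel shift between left and right images for a point that should be
    perceived at perceived_distance_m from the viewer.

    viewing_distance_m: assumed distance from the screen to the viewer.
    screen_width_m, screen_width_px: physical and pixel width of the screen.
    eye_separation_m: assumed interocular distance (hypothetical default).
    Positive results place the point behind the screen, negative in front.
    """
    if perceived_distance_m <= 0:
        raise ValueError("perceived distance must be positive")
    # On-screen separation from similar triangles between eyes and point.
    parallax_m = (eye_separation_m
                  * (perceived_distance_m - viewing_distance_m)
                  / perceived_distance_m)
    # Convert the physical separation to pixels for this particular display.
    return parallax_m * screen_width_px / screen_width_m
```

Under these assumptions the same perceived depth needs a larger pixel shift on a physically smaller screen of equal resolution, which is why the process is determined individually for each display apparatus.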

27. A method for image processing using depth information, comprising:

a step of creating a stereo image by giving a parallax for every image part of a 2-D image according to its depth information,
wherein 

an uneven image frame outline caused by the parallax which has been given is corrected.

28. A method for image processing using depth information, comprising:

a step of creating a stereo image by giving a parallax for every image part of a 2-D image according to its
depth information, 
wherein 

a desired shape of an image frame is achieved by cutting off a peripheral part of the image.



29. A method for conducting an image processing to a 2-D image according to its depth information,
wherein 

an image area subject to the image processing is determined, based on the depth information.




30. A method for image processing according to claim 29,
wherein 

the image processing is a process for changing a size of the image area.

31. A method for image processing according to claim 30,
wherein 

the process for changing a size of the image area is a process for expanding a size of an image area with a smaller depth so as to be relatively larger than a size of an image area with a larger depth.

32. A method for image processing according to claim 29,
wherein 

the image processing is a process for conducting separation of a desired image area. 

33. A method for image processing according to claim 32,
wherein

the separation is conducted with respect to an image area having a depth within a predetermined range.
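
The separation of claim 33 amounts to masking the image by its depth values. The sketch below assumes that the image and its depth information are stored as equally sized nested lists, one depth value per pixel; the function name, the inclusive range bounds, and the fill value are hypothetical.

```python
def separate_by_depth(image, depth_map, near, far, fill=0):
    """Keep pixels whose depth lies within [near, far]; replace the rest.

    image, depth_map: nested lists of identical shape (an assumption).
    near, far: bounds of the predetermined depth range, inclusive.
    fill: value written to pixels outside the range (hypothetical).
    """
    return [
        [pixel if near <= depth <= far else fill
         for pixel, depth in zip(image_row, depth_row)]
        for image_row, depth_row in zip(image, depth_map)
    ]
```

The separated area can then be pasted over another image by writing only the pixels that were kept, which corresponds to the combination described in claim 34.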

34. A method for image processing according to claim 32,

wherein 

an image area which has been separated is combined with another image.

35. A method for conducting image processing with respect to a 2-D image according to its depth information,

wherein
images with viewpoints at a plurality of points on a hypothetical moving path, where a shooting point of the 2-D image is hypothetically moved, are created for use as a slow motion image, based on the depth information.